(57) Abstract



Novel packaging cell lines useful for generating viral accessory protein independent HIV-derived retroviral vector particles, methods 
of constructing such packaging cell lines and methods of using the viral accessory protein independent HIV-derived retroviral vector 
particles are disclosed. 
PACKAGING CELL LINES FOR HIV-DERIVED RETROVIRAL VECTOR PARTICLES

BACKGROUND OF THE INVENTION 

Retroviral vectors based on lentiviruses, such as human immunodeficiency
viruses (HIV), can infect nondividing cells, and integration of proviral DNA occurs
without the need for cell division. These properties make lentiviruses attractive for gene
transfer into nondividing cells, such as hepatocytes, myofibers, hematopoietic stem
cells, and neurons.

However, the use of lentivirus vectors, particularly HIV vectors, for gene therapy
is hampered by concern over their safety. Thus, a need exists for the development of
lentivirus vectors, particularly HIV vectors, with improved safety, particularly for
gene therapy.

SUMMARY OF THE INVENTION 

The present invention relates to novel packaging cell lines useful for generating
viral accessory protein independent lentivirus-derived, particularly HIV-derived,
retroviral vector particles, to construction of such cell lines and to methods of using the
accessory protein independent lentivirus-derived retroviral vector particles to introduce
DNA of interest into cells (e.g., eukaryotic cells such as animal (particularly
mammalian), plant or yeast cells or prokaryotic cells such as bacterial cells). In a
preferred embodiment, the packaging cell lines of the present invention are stable
packaging cell lines.

In one embodiment of the invention, packaging cell lines for producing viral
accessory protein independent lentivirus-derived retroviral vector particles comprise (a)
a cell (e.g., mammalian cell); and (b) a retroviral nucleotide sequence in the cell which
comprises a coding sequence for lentivirus gagpol, wherein said coding sequence has
been codon optimized by mutagenesis to improve expression of the lentivirus gagpol
proteins.

In a second embodiment of the invention, packaging cell lines for producing
viral accessory protein independent lentivirus-derived retroviral vector particles
comprise (a) a cell (e.g., mammalian cell); (b) a first retroviral nucleotide sequence in
the cell which comprises a coding sequence for lentivirus gagpol, wherein said coding
sequence has been codon optimized by mutagenesis to improve expression of the
lentivirus gagpol proteins; and (c) a second retroviral nucleotide sequence in the cell
which comprises the coding sequence for a heterologous envelope protein.

In a third embodiment of the invention, packaging cell lines for producing viral
accessory protein independent lentivirus-derived retroviral vector particles comprise (a)
a cell (e.g., mammalian cell); (b) a first retroviral nucleotide sequence in the cell which
comprises a coding sequence for lentivirus gagpol, wherein said coding sequence has
been codon optimized by mutagenesis to improve expression of the lentivirus gagpol
proteins; (c) a second retroviral nucleotide sequence in the cell which comprises the
coding sequence for a heterologous envelope protein; and (d) a third retroviral
nucleotide sequence which comprises a DNA sequence of interest and lentivirus
cis-acting sequences required for packaging, reverse transcription and integration.

In a fourth embodiment of the invention, packaging cell lines for producing
viral accessory protein independent lentivirus-derived retroviral vector particles
comprise (a) a cell (e.g., mammalian cell); (b) a retroviral nucleotide sequence in the
cell which comprises a coding sequence for lentivirus gagpol, wherein said coding
sequence has been codon optimized by mutagenesis to improve expression of the
lentivirus gagpol proteins; and (c) a retroviral nucleotide sequence which comprises a
DNA sequence of interest and lentivirus cis-acting sequences required for packaging,
reverse transcription and integration.

In a fifth embodiment of the invention, packaging cell lines for producing viral
accessory protein independent HIV-derived retroviral vector particles comprise (a) a cell
(e.g., mammalian cell); and (b) a retroviral nucleotide sequence in the cell which
comprises a coding sequence for HIV gagpol, wherein said coding sequence has been
codon optimized by mutagenesis to improve expression of the HIV gagpol proteins.

In a sixth embodiment of the invention, packaging cell lines for producing viral
accessory protein independent HIV-derived retroviral vector particles comprise (a) a cell
(e.g., mammalian cell); (b) a first retroviral nucleotide sequence in the cell which
comprises a coding sequence for HIV gagpol, wherein said coding sequence has been
codon optimized by mutagenesis to improve expression of the HIV gagpol proteins; and
(c) a second retroviral nucleotide sequence in the cell which comprises the coding
sequence for a heterologous envelope protein.

In a seventh embodiment of the invention, packaging cell lines for producing
viral accessory protein independent HIV-derived retroviral vector particles comprise (a)
a cell (e.g., mammalian cell); (b) a first retroviral nucleotide sequence in the cell which
comprises a coding sequence for HIV gagpol, wherein said coding sequence has been
codon optimized by mutagenesis to improve expression of the HIV gagpol proteins; (c)
a second retroviral nucleotide sequence in the cell which comprises the coding sequence
for a heterologous envelope protein; and (d) a third retroviral nucleotide sequence which
comprises a DNA sequence of interest and HIV cis-acting sequences required for
packaging, reverse transcription and integration.

In an eighth embodiment of the invention, packaging cell lines for producing
viral accessory protein independent HIV-derived retroviral vector particles comprise (a)
a cell (e.g., mammalian cell); (b) a retroviral nucleotide sequence in the cell which
comprises a coding sequence for HIV gagpol, wherein said coding sequence has been
codon optimized by mutagenesis to improve expression of the HIV gagpol proteins; and
(c) a retroviral nucleotide sequence which comprises a DNA sequence of interest and
HIV cis-acting sequences required for packaging, reverse transcription and integration.

Alternatively, each of the packaging cell lines described herein can be produced
using (1) a retroviral nucleotide sequence which comprises a codon optimized gag
coding sequence and (2) a retroviral nucleotide sequence which comprises a codon
optimized pol coding sequence, in place of the retroviral nucleotide sequence which
comprises a codon optimized gagpol coding sequence.

In a particular embodiment, the heterologous envelope protein is the G
glycoprotein of vesicular stomatitis virus (VSV G). In another embodiment, the
heterologous envelope protein is the amphotropic envelope of the Moloney leukemia
virus (MLV).

Cell lines for producing viral accessory protein independent lentivirus-derived
retroviral vector particles are produced by transfecting host cells (e.g., mammalian host
cells) with a plasmid comprising a DNA sequence which encodes lentivirus gagpol
proteins, wherein said DNA sequence has been codon optimized by mutagenesis to
improve expression of the lentivirus gagpol proteins. Depending upon the particular
cell line being produced, the host cells are also co-transfected with a plasmid
comprising a DNA sequence which encodes a heterologous envelope protein, or a
plasmid comprising a DNA sequence of interest and lentivirus cis-acting sequences
required for packaging, reverse transcription and integration, or both of these plasmids.
Alternatively, host cells are transfected with a plasmid comprising a codon optimized
DNA sequence encoding a lentivirus gag protein and a plasmid comprising a codon
optimized DNA sequence encoding a lentivirus pol protein, in place of the plasmid
comprising a codon optimized DNA sequence encoding both lentivirus gagpol proteins.

Cell lines for producing viral accessory protein independent HIV-derived
retroviral vector particles are produced by co-transfecting host cells (e.g., mammalian
host cells) with a plasmid comprising a DNA sequence which encodes HIV gagpol
proteins, wherein said DNA sequence has been codon optimized by mutagenesis to
improve expression of the HIV gagpol proteins. Depending upon the particular cell line
being produced, the host cells are also co-transfected with a plasmid comprising a DNA
sequence which encodes a heterologous envelope protein, or a plasmid comprising a
DNA sequence of interest and HIV cis-acting sequences required for packaging, reverse
transcription and integration, or both of these plasmids. Alternatively, host cells are
transfected with a plasmid comprising a codon optimized DNA sequence encoding an
HIV gag protein and a plasmid comprising a codon optimized DNA sequence encoding
an HIV pol protein, in place of the plasmid comprising a codon optimized DNA sequence
encoding both HIV gagpol proteins.

The present invention also relates to methods of producing viral accessory
protein independent lentivirus-derived retroviral vector particles, comprising
co-transfecting host cells (e.g., mammalian host cells) with (a) a first plasmid comprising a
DNA sequence which encodes lentivirus gagpol proteins, wherein said DNA sequence
has been codon optimized by mutagenesis to improve expression of the lentivirus gagpol
proteins; (b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and (c) a third plasmid comprising a DNA sequence of
interest and lentivirus cis-acting sequences required for packaging, reverse transcription
and integration. Alternatively, host cells are transfected with a plasmid comprising a
codon optimized DNA sequence encoding a lentivirus gag protein and a plasmid
comprising a codon optimized DNA sequence encoding a lentivirus pol protein, in place
of the first plasmid comprising a codon optimized DNA sequence encoding both
lentivirus gagpol proteins.

In a particular embodiment, the invention relates to methods of producing viral
accessory protein independent HIV-derived retroviral vector particles, comprising
co-transfecting host cells (e.g., mammalian host cells) with (a) a first plasmid comprising a
DNA sequence which encodes HIV gagpol proteins, wherein said DNA sequence has
been codon optimized by mutagenesis to improve expression of the HIV gagpol
proteins; (b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and (c) a third plasmid comprising a DNA sequence of
interest and HIV cis-acting sequences required for packaging, reverse transcription and
integration. Alternatively, host cells are transfected with a plasmid comprising a codon
optimized DNA sequence encoding an HIV gag protein and a plasmid comprising a
codon optimized DNA sequence encoding an HIV pol protein, in place of the first
plasmid comprising a codon optimized DNA sequence encoding both HIV gagpol
proteins.

The present invention also relates to viral accessory protein independent
retroviral particles produced by or obtainable by (obtained by) the methods described
herein.

The present invention further relates to isolated DNA encoding a codon
optimized lentivirus gagpol, isolated DNA encoding the gag coding region of a codon
optimized lentivirus gagpol, and isolated DNA encoding the pol coding region of a
codon optimized lentivirus gagpol. In a particular embodiment, the present invention
relates to isolated DNA encoding a codon optimized HIV gagpol, isolated DNA
encoding the gag coding region of a codon optimized HIV gagpol, and isolated DNA
encoding the pol coding region of a codon optimized HIV gagpol.

The packaging cell lines and viral particles of the present invention can be used
for gene therapy or gene replacement with improved safety. The packaging cell lines
and viral particles of the present invention can also be used in development and
[...] of vaccines, and in production of biochemical reagents. Gene therapy
vectors produced with the cell lines of the present invention are expected to be valuable
medical therapeutics.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a schematic diagram of an expression cassette containing the codon
optimized gagpol genes. The DNA was constructed in multiple segments, which are
indicated at the top as 1/3, 2/3, 3/3 (A, B, C and D) and HIN. Restriction sites used to
assemble the cloned segments are indicated above the kilobase pair (Kb) ruler. Below
the ruler are multiple features showing the location of the human cytomegalovirus
promoter, human betaglobin sequences (Bglobin), mRNA sequences (mRNA),
intronic sequence, the gag and pol reading frames, the individual
proteolytic fragment coding sequences (p17_MA, p24_CA, p7, p6, PR, p51_RT,
RNaseH and integrase (IN)) and each synthetic oligonucleotide used in the assembly
process (multiple adjacent open arrows).

Figure 2 is a table which depicts codon usage frequencies in genes which are
highly expressed and in the codon optimized gagpol open reading frame of the HIV
packaging construct described herein.

Figure 3 is a schematic representation of the HIV provirus and a three-plasmid
expression system used for generating a pseudotyped HIV-based vector by transient
transfection as described in Naldini et al., Science, 272:263-267 (1996).

Figure 4 is a list of some characteristics relating to the HIV Rev protein.

Figure 5 is a list of some points relating to codon optimization of HIV gagpol.

Figure 6 is a partial DNA sequence of HIV gag (SEQ ID NO: 1), showing
inactivation of inhibitory sequences as described in Schwartz, S. et al., J. Virol.,
66(12):7176-7182 (1992).

Figure 7 is a plot of the %(G+C) content of wildtype HIV gagpol sequences and
theoretically codon optimized HIV gagpol sequences. The percent of bases, either G or
C, was calculated for a 30 nucleotide moving window for the entire length of the gagpol
gene, and the value plotted versus nucleotide position. Diamonds = HIV gagpol
sequences; squares = full optimal back-translation for gag open reading frame;
triangles = full optimal back-translation for pol open reading frame; CO = codon
optimized.
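The moving-window calculation described for Figure 7 can be sketched as follows. This is a minimal Python illustration, assuming the sequence is supplied as a plain DNA string; the function name `gc_percent_windows` and the toy input are hypothetical and are not taken from any of the SEQ ID NOs.

```python
def gc_percent_windows(seq: str, window: int = 30) -> list[float]:
    """Return %(G+C) for every full window of `window` nucleotides.

    Element i of the result covers nucleotides i .. i+window-1 (0-based).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seq = seq.upper()
    if len(seq) < window:
        return []
    is_gc = [base in "GC" for base in seq]
    count = sum(is_gc[:window])          # G+C count in the first window
    values = [100.0 * count / window]
    for start in range(1, len(seq) - window + 1):
        # Slide by one base: add the base entering, drop the base leaving.
        count += is_gc[start + window - 1] - is_gc[start - 1]
        values.append(100.0 * count / window)
    return values


# Toy input (hypothetical): 30 A's followed by 30 G's.
toy = "A" * 30 + "G" * 30
profile = gc_percent_windows(toy)
print(profile[0], profile[15], profile[-1])
```

Plotting the returned values against window start position would give a curve of the kind the figure describes; the choice of which position within the window to plot against is an assumption here.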

Figures 8A-8E depict the alignment of the nucleotide sequences and predicted
amino acid sequences for the gag coding region of a wildtype HIV gagpol and a codon
optimized HIV gagpol. "NL4-3 genbank.SEQ" indicates the nucleotide sequence (SEQ
ID NO:2) and predicted amino acid sequence (SEQ ID NO:3) for the gag coding region
of a wildtype HIV gagpol. "pHDMHgpm2.seq" indicates the nucleotide sequence (SEQ
ID NO:4) and predicted amino acid sequence (SEQ ID NO:5) for the gag coding region
of a codon optimized HIV gagpol. The "NL4-3 genbank.SEQ" sequences are publicly
available at the NIH GenBank sequence repository (Accession No. M19921).

Figures 9A-9L depict the alignment of the nucleotide sequences and predicted
amino acid sequences for the pol coding region of a wildtype HIV gagpol and a codon
optimized HIV gagpol. "NL4-3 genbank.SEQ" indicates a nucleotide sequence (SEQ
ID NO:6) and a predicted amino acid sequence (SEQ ID NO:7) for the pol coding
region of a wildtype HIV gagpol available in the NIH GenBank sequence repository
(Accession No. M19921). The nucleotide and amino acid sequences for the pol coding
region available in the GenBank sequence repository contain two sequence errors,
which are indicated in Figures 9A-9L with shading. "pNL4-3.seq" indicates the correct
nucleotide sequence (SEQ ID NO:8) and predicted amino acid sequence (SEQ ID
NO:9) for the pol coding region of a wildtype HIV gagpol. "pHDMHgpm2.seq"
indicates the nucleotide sequence (SEQ ID NO:10) and predicted amino acid sequence
(SEQ ID NO:11) for the pol coding region of a codon optimized HIV gagpol.

Figures 10A-10D depict the DNA sequence (SEQ ID NO:12) for pHDMHgpm2.
The CMV enhancer/promoter is at nucleotides 97 to 679, human betaglobin sequences
(Bglobin) are at nucleotides 761 to 864, 865 to 1303 and 5710 to 6469 (end of Bglobin
is at nucleotides 6445 to 6469), mRNA sequences are at nucleotides 680 to 778 and 1255
to 5921, the SV40 origin of replication is at nucleotides 8796 to 8908, the beta-lactamase
(bla) coding region is at nucleotides 6709 to 7569, intron sequences are at nucleotides 779
to 1254, the codon optimized gag coding region is at nucleotides 1318 to 2820, the codon
optimized pol coding region is at nucleotides 2619 to 5624 and the poly A site is at
nucleotides 5897 to 5921.
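As a reading aid, the feature coordinates listed for Figures 10A-10D can be held in a small lookup table and checked arithmetically. The sketch below is only illustrative: it assumes the coordinates are 1-based and inclusive, it covers only the features whose coordinates are legible in the text, and the helper names `span_length` and `span_overlap` are hypothetical.

```python
# Feature coordinates as listed for pHDMHgpm2 (assumed 1-based, inclusive).
FEATURES = {
    "CMV enhancer/promoter": (97, 679),
    "intron": (779, 1254),
    "codon optimized gag": (1318, 2820),
    "codon optimized pol": (2619, 5624),
    "poly A site": (5897, 5921),
    "bla coding region": (6709, 7569),
    "SV40 origin of replication": (8796, 8908),
}


def span_length(span):
    """Number of nucleotides in an inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def span_overlap(a, b):
    """Return the inclusive span shared by a and b, or None if disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


gag = FEATURES["codon optimized gag"]
pol = FEATURES["codon optimized pol"]
shared = span_overlap(gag, pol)

for name, span in FEATURES.items():
    print(f"{name}: {span[0]}-{span[1]} ({span_length(span)} nt)")
print("gag/pol shared span:", shared, span_length(shared), "nt")
```

With the coordinates as listed, the annotated gag and pol coding regions share nucleotides 2619 to 2820.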

Figure 11 is a circular map of plasmid pHDMHgpm2.

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to novel packaging cell lines useful for generating 
viral accessory protein independent lentivirus-derived, particularly HIV-derived, 



wo 00/15819 



PCT/US99/2067S 




retroviral vector particles, to construction of such cell lines and to methods of using the
accessory protein independent lentivirus-derived retroviral vector particles to introduce
DNA of interest into cells (e.g., eukaryotic cells such as animal (particularly
mammalian), plant or yeast cells or prokaryotic cells such as bacterial cells). In a
particular embodiment, the packaging cell lines of the present invention are stable
packaging cell lines.

The cell lines are engineered to express the lentivirus proteins necessary for
virus particle formation (gagpol proteins), without containing DNA sequences from
lentivirus accessory proteins (tat, vif, vpr, vpu, nef and rev proteins and Rev response
element (RRE)). Additionally, no viral sequences (such as cis-acting elements termed
constitutive transport elements (CTEs)) will be expressed as RNA of any kind. DNA
sequences for lentivirus gagpol are codon optimized by extensively mutagenizing the
sequences to improve expression and to reduce the risk of recombination between
transfer vector sequences and gagpol messenger RNA. This greatly improves the safety
of virus preparations generated from these cell lines. In a particular embodiment, the
DNA sequences for lentivirus gagpol are not codon optimized in the overlap region
between the gag and pol sequences and in cis-acting signals necessary for translation of
pol.

Examples of lentiviruses include human immunodeficiency viruses (e.g., HIV-1,
HIV-2, HIV-3), bovine lentiviruses (e.g., bovine immunodeficiency viruses, bovine
immunodeficiency-like viruses, Jembrana disease viruses), equine lentiviruses (e.g.,
equine infectious anemia viruses), feline lentiviruses (e.g., feline immunodeficiency
viruses, panther lentiviruses, puma lentiviruses), ovine/caprine lentiviruses (e.g.,
Brazilian caprine lentiviruses, caprine arthritis-encephalitis viruses, Maedi-Visna
viruses, Maedi-Visna-like viruses, Maedi-Visna-related viruses, ovine lentiviruses,
Visna lentiviruses), simian AIDS retroviruses (e.g., human T-cell lymphotropic virus
type 4), simian immunodeficiency viruses, simian-human immunodeficiency viruses,
human lymphotrophic viruses (e.g., type III) and simian T-cell lymphotrophic viruses.






In another embodiment, cell lines are engineered to express the HIV proteins
necessary for virus particle formation (gagpol proteins), without containing DNA
sequences from HIV accessory proteins (tat, vif, vpr, vpu, nef and rev proteins and Rev
response element (RRE)). Additionally, no viral sequences (such as cis-acting elements
termed constitutive transport elements (CTEs)) will be expressed as RNA of any kind.
DNA sequences for an HIV gagpol are codon optimized by mutagenesis to improve
expression and to reduce the risk of recombination between transfer vector sequences
and gagpol messenger RNA. In a particular embodiment, the DNA sequences for HIV
gagpol are not codon optimized in the overlap region between the gag and pol
sequences and in cis-acting signals necessary for translation of pol.

Alternatively, each of the packaging cell lines described herein can be produced
using (1) a nucleotide sequence which comprises a codon optimized gag coding
sequence and (2) a nucleotide sequence which comprises a codon optimized pol coding
sequence, in place of the nucleotide sequence which comprises a codon optimized
gagpol coding sequence. In this embodiment, the gag and pol coding sequences can be
completely codon optimized.

Benefits of the present invention include the removal of potentially harmful
lentivirus accessory proteins and other viral sequences, and the reduction of the risk of
recombination to produce replication competent virus.

Packaging cell lines for producing viral accessory protein independent
lentivirus-derived retroviral vector particles comprise a mammalian cell and a retroviral
nucleotide sequence comprising a coding sequence for a lentivirus gagpol which has
been codon optimized. In a particular embodiment, the packaging cell lines further
comprise a retroviral nucleotide sequence comprising a coding sequence for a
heterologous envelope protein. In a second embodiment, the packaging cell lines
further comprise a retroviral nucleotide sequence comprising a coding sequence for a
heterologous envelope protein and a retroviral nucleotide sequence which comprises a
DNA sequence of interest and HIV cis-acting sequences required for packaging, reverse
transcription and integration. In a third embodiment, the packaging cell lines further
comprise a retroviral nucleotide sequence which comprises a DNA sequence of interest
and HIV cis-acting sequences required for packaging, reverse transcription and
integration. Alternatively, the packaging cell lines of the present invention comprise (1) a
retroviral nucleotide sequence which comprises a codon optimized gag coding sequence
and (2) a retroviral nucleotide sequence which comprises a codon optimized pol coding
sequence, in place of the retroviral nucleotide sequence which comprises a codon
optimized gagpol coding sequence.

The coding sequence(s) for lentivirus gagpol which has (have) been codon
optimized results in improved expression of the lentivirus gagpol proteins and reduces
the risk of recombination between the transfer vector and gagpol messenger RNA.
Codon optimization of the coding sequence(s) for lentivirus gagpol was obtained by
mutagenizing, for each particular amino acid residue, specific nucleic acid bases in a
codon for the particular amino acid residue to a nucleic acid base which is present in a
codon which occurs at a high frequency in genes which are highly expressed for the
same amino acid residue. In a particular embodiment, the resulting optimized codon
also does not cause introduction of mRNA splicing signals into the codon optimized
sequence. Thus, in a particular embodiment, codon optimization of the coding
sequence(s) for lentivirus gagpol is obtained by mutagenizing, for each particular amino
acid residue, specific nucleic acid bases in a codon for the particular amino acid residue
to a nucleic acid base that is present in a codon which (1) occurs at a high frequency in
genes which are highly expressed for the same amino acid residue and (2) does not
cause introduction of mRNA splicing signals into the codon optimized sequence.
Codon optimization typically results in the removal of nucleic acid base A-rich
instability elements.
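The codon replacement step described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the procedure actually used: the `PREFERRED` table is a hypothetical stand-in for the codon usage frequencies of highly expressed genes (it does not reproduce Figure 2), the `protected` argument is a hypothetical way to leave a region such as the gag/pol overlap untouched, and the check for newly introduced mRNA splicing signals is not implemented.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical "preferred codon" per amino acid (illustrative values only).
PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def translate(dna: str) -> str:
    """Translate complete codons; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(GENETIC_CODE[dna[i:i + 3]] for i in range(0, usable, 3))


def optimize(dna: str, protected=None) -> str:
    """Swap each codon for the preferred synonymous codon.

    `protected` is an optional 0-based, half-open (start, end) nucleotide
    span; any codon touching it is copied unchanged.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    out = []
    for i in range(0, usable, 3):
        codon = dna[i:i + 3]
        if protected and i < protected[1] and i + 3 > protected[0]:
            out.append(codon)
        else:
            out.append(PREFERRED[GENETIC_CODE[codon]])
    return "".join(out)


# Toy input (hypothetical), encoding M-G-A-R-A-S.
wild = "ATGGGTGCGAGAGCGTCA"
recoded = optimize(wild)
print(recoded, translate(wild) == translate(recoded))
```

The key property the sketch preserves is that the encoded amino acid sequence is unchanged while the nucleotide sequence differs.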

In a particular embodiment, the coding sequence for an HIV gagpol (pNL4-3;
available through the AIDS repository, NIH; Adachi et al., J. Virol., 59:284-291 (1986))
has been codon optimized to improve translational efficiency of the HIV gagpol
proteins and reduce the risk of recombination between the transfer vector and HIV
gagpol messenger RNA. Two hundred thirty-seven base pairs (237 bp) consisting of the
gagpol overlap and cis-acting signals necessary for translation of pol (nucleotides 2583
to 2819 of SEQ ID NO: 12) were not optimized. The HIV gagpol sequence obtained
using the codon optimization process does not differ at the amino acid level from the
wildtype HIV gagpol sequence, but differs at the nucleotide level from the HIV gagpol
sequence. A codon optimized HIV gag sequence is shown in Figures 8A-8E
(pHDMHgpm2.seq) (SEQ ID NO:4). A codon optimized HIV pol sequence is shown in
Figures 9A-9L (pHDMHgpm2.seq) (SEQ ID NO:10).

A plasmid comprising DNA sequences which encode codon optimized lentivirus
gagpol proteins is also referred to herein as a packaging construct. This plasmid
includes a promoter which drives the expression of the gagpol protein, such as the
human cytomegalovirus (hCMV) immediate early promoter. This plasmid is defective
for the production of the viral envelope and accessory proteins tat, vif, vpr, vpu, nef and
rev and the Rev response element (RRE). The packaging construct also does not
contain viral sequences which are transcribed into mRNA, such as constitutive transport
[...] Figure 1 and in Figure 11. Figures 10A-10D depict the DNA sequence (SEQ ID
NO:12) for the packaging construct pHDMHgpm2. This packaging construct
(pHDMHgpm2) was constructed as follows: Plasmid pMDA.HIVgp mam was
[...] (Stemmer et al., Gene, 164:49-53 (1995)) of [...] different oligonucleotides. The DNA
sequence for pMDA.HIVgp mam is the same as the DNA sequence for
pMDA.HIVgp jtg, except for 4.3 kb which was codon optimized using the [...]
program (LaserGene, Madison, WI). Two hundred thirty-seven base pairs (237 bp)
consisting of the gagpol overlap and cis-acting signals necessary for translation of pol
(nucleotides 2583 to 2819 of SEQ ID NO: 12) were not optimized due to dual reading
frame constraints. A NsiI site 5' of IN was preserved to aid fusion with wildtype
sequences. Several single or double base pair silent mutations were introduced either to
prevent potential splice donors and acceptors, or by the synthesis process.
pMDA.HIVgp jtg was derived from HIV-1 strain NL4-3. The protease mutation that is
present in the NL4-3 NIH GenBank sequence was then repaired (Figure 9B), changing
the nucleotide present at position 2948 of SEQ ID NO: 12 from a "G" to a "C", thereby
producing the codon present at nucleotide positions 2948 to 2950 of SEQ ID NO: 12
which encodes an arginine instead of the glycine present in the NL4-3 GenBank amino
acid sequence. The resulting plasmid was named pMDHgpmam. The EcoRI-HindIII
fragment of pMDHgpmam was inserted into pHDM2b, a high copy version of the pMD
vector (Ory, D. et al., Proc. Natl. Acad. Sci. USA, 93(21):11400-11406 (1996)), to
produce plasmid pHDMHgpm. The sequencing mutation that is present in the RNase
domain of the NL4-3 NIH GenBank sequence was repaired (Figure 9H), changing the
codon present at nucleotide positions 4724 to 4726 of SEQ ID NO: 12 from "GGG" to
"AAG", thereby producing a codon encoding a lysine instead of the glycine present in
the NL4-3 GenBank amino acid sequence. The resulting plasmid was named
pHDMHgpm2. Codon usage frequencies in the codon optimized gagpol open reading
frame of the packaging construct pHDMHgpm2 are shown in Figure 2.
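The amino acid consequences of the two repairs described above can be checked against the standard genetic code. The following Python sketch is illustrative only: it looks at the codon changes in isolation (a first-position G to C change in a glycine codon, and GGG to AAG) and does not use SEQ ID NO: 12 itself; the names `GENETIC_CODE` and `first_base_g_to_c` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Repair 1: the first base of a glycine codon (GGN) changes from G to C.
# Whatever the third base is, the result (CGN) encodes arginine.
first_base_g_to_c = {
    "GG" + n: GENETIC_CODE["CG" + n] for n in BASES
}

# Repair 2: the whole codon changes from GGG to AAG.
before, after = GENETIC_CODE["GGG"], GENETIC_CODE["AAG"]

print(first_base_g_to_c)
print(f"GGG ({before}) -> AAG ({after})")
```

Both checks agree with the text: the first repair converts glycine (G) to arginine (R), and the second converts glycine (G) to lysine (K).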

As used herein, a heterologous envelope protein permits pseudotyping of
particles generated by the packaging construct and includes the G glycoprotein of
vesicular stomatitis virus (VSV G) and the amphotropic envelope of the Moloney
leukemia virus (MLV). A plasmid comprising a DNA sequence which encodes a
heterologous envelope protein is also referred to herein as an envelope coding plasmid.

The terms "mammal" and "mammalian", as used herein, refer to any vertebrate
animal, including monotremes, marsupials and placental, that suckle their young and
either give birth to living young (eutherian or placental mammals) or are egg-laying
(metatherian or nonplacental mammals). Examples of mammalian species include
humans and other primates (e.g., monkeys, chimpanzees), rodents (e.g., rats, mice,
guinea pigs) and ruminants (e.g., cows, pigs, horses).

Examples of mammalian cells include human (such as HeLa cells, 293T cells,
NIH 3T3 cells), bovine, ovine, porcine, murine (such as embryonic stem cells), rabbit
and monkey (such as COS1 cells) cells. The cell may be a non-dividing cell (including
hepatocytes, myofibers, hematopoietic stem cells, neurons) or a dividing cell. The cell
may be an embryonic cell, bone marrow stem cell or other progenitor cell. Where the
cell is a somatic cell, the cell can be, for example, an epithelial cell, fibroblast, smooth
muscle cell, blood cell (including a hematopoietic cell, red blood cell, T-cell, B-cell,
etc.), tumor cell, cardiac muscle cell, macrophage, dendritic cell, neuronal cell (e.g., a
glial cell or astrocyte), or pathogen-infected cell (e.g., those infected by bacteria,
viruses, virusoids, parasites, or prions).

Typically, cells isolated from a specific tissue (such as epithelium, fibroblast or
hematopoietic cells) are categorized as a "cell-type." The cells can be obtained
commercially or from a depository or obtained directly from an animal, such as by
biopsy. Alternatively, the cell need not be isolated at all from the animal where, for
example, it is desirable to deliver the virus to the animal in gene therapy.

To produce the cell lines of the present invention for producing viral accessory
protein independent lentivirus-derived retroviral vector particles, mammalian host cells
are co-transfected with (a) a first plasmid comprising a DNA sequence which encodes
lentivirus gagpol proteins, wherein said DNA sequence has been codon optimized by
mutagenesis, as described above, to improve expression of the lentivirus gagpol
proteins; and (b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein, or a retroviral nucleotide sequence which comprises a
DNA sequence of interest and lentivirus cis-acting sequences required for packaging,
reverse transcription and integration, or both, under conditions appropriate for
transfection of the cells.

In a particular embodiment, to produce the cell lines of the present invention for
producing viral accessory protein independent HIV-derived retroviral vector particles,
mammalian host cells were co-transfected with (a) a first plasmid comprising a DNA
sequence which encodes HIV gagpol proteins, wherein said DNA sequence has been
codon optimized by mutagenesis, as described above, to improve expression of the HIV
gagpol proteins; and (b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein, or a retroviral nucleotide sequence which comprises a
DNA sequence of interest and HIV cis-acting sequences required for packaging, reverse
transcription and integration, or both, under conditions appropriate for transfection of
the cells.

Virus stocks consisting of viral accessory protein independent lentivirus-derived,
particularly HIV-derived, retroviral vector particles of the present invention are
produced by maintaining the transfected cells under conditions suitable for virus
production (e.g., in an appropriate growth media and for an appropriate period of time).
Such conditions, which are not critical to the invention, are generally known in the art.
See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition,
Cold Spring Harbor University Press, New York (1989); Ausubel et al., Current
Protocols in Molecular Biology, John Wiley & Sons, New York (1998); U.S. Patent
No. 5,449,614; and U.S. Patent No. 5,460,959, the teachings of which are incorporated
herein by reference.

To generate viral accessory protein independent lentivirus-derived retroviral
vector particles, mammalian host cells can be co-transfected with (a) a first plasmid
comprising a DNA sequence which encodes lentivirus gagpol proteins, wherein said DNA
sequence has been codon optimized by mutagenesis, as described above, to improve
expression of the lentivirus gagpol proteins; (b) a second plasmid comprising a DNA
sequence which encodes a heterologous envelope protein; and (c) a third plasmid
comprising a DNA sequence of interest and lentivirus cis-acting sequences required for
packaging, reverse transcription and integration. Alternatively, mammalian host cells
are transfected with a plasmid comprising a codon optimized DNA sequence encoding a
lentivirus gag protein and a plasmid comprising a codon optimized DNA sequence
encoding a lentivirus pol protein, in place of the first plasmid comprising a codon
optimized DNA sequence encoding both lentivirus gagpol proteins.

In a particular embodiment, the invention relates to methods of producing viral
accessory protein independent HIV-derived retroviral vector particles, comprising
co-transfecting mammalian host cells with (a) a first plasmid comprising a DNA sequence
which encodes HIV gagpol proteins, wherein said DNA sequence has been codon
optimized by mutagenesis, as described above, to improve expression of the HIV gagpol
proteins; (b) a second plasmid containing a DNA sequence which encodes a
heterologous envelope protein; and (c) a third plasmid comprising a DNA sequence of
interest and HIV cis-acting sequences required for packaging, reverse transcription and
integration. Alternatively, mammalian host cells are transfected with a plasmid
comprising a codon optimized DNA sequence encoding a HIV gag protein and a
plasmid comprising a codon optimized DNA sequence encoding a HIV pol protein, in
place of the first plasmid comprising a codon optimized DNA sequence encoding both
HIV gagpol proteins.

Virus particles produced by the methods described herein, using a codon
optimized HIV packaging construct produced as described herein, were compared by
Western analysis with virus particles produced as described in Naldini et al., Science,
272:263-267 (1996), using the packaging construct plasmid pCMVΔR8.2. Both the
immunological reactivity and the proteolytic processing were confirmed to be
indistinguishable.

A plasmid comprising a DNA sequence of interest and HIV cis-acting sequences
required for packaging, reverse transcription and integration is also referred to herein as
a transfer vector. A transfer vector, as used herein, refers to a vehicle which is used to
introduce a DNA of interest into a eukaryotic cell, particularly a mammalian cell.
Figure 3 depicts an example of a transfer vector.

DNA sequence of interest, as used herein, includes all or a portion of a gene or
genes encoding a nucleic acid product whose expression in a cell or a mammal is
desired. In a particular embodiment, the nucleic acid product is a heterologous
therapeutic protein. Examples of therapeutic proteins include antigens or immunogens,
such as a polyvalent vaccine, cytokines, tumor necrosis factor, interferons, interleukins,
adenosine deaminase, insulin, T-cell receptors, soluble CD4, growth factors (such as
epidermal growth factor, human growth factor, insulin-like growth factors, fibroblast
growth factors), blood factors, such as Factor VIII, Factor IX, cytochrome b,
glucocerebrosidase, ApoE, ApoC, ApoAI, the LDL receptor, negative selection markers
or "suicide proteins", such as thymidine kinase (including the HSV, CMV, VZV TK),
anti-angiogenic factors, Fc receptors, plasminogen activators, such as t-PA, u-PA and
streptokinase, dopamine, MHC, tumor suppressor genes such as p53 and Rb,
monoclonal antibodies or antigen binding fragments thereof, drug resistance genes, ion
channels, such as a calcium channel or a potassium channel, adrenergic receptors,
hormones (including growth hormones) and anti-cancer agents. In another embodiment,
the nucleic acid product is a gene product to be expressed in a cell or a mammal and
which product is otherwise defective or absent in the cell or mammal. For example, the
nucleic acid product can be the product of a functional gene(s) which is defective or
absent in the cell or mammal.

DNA sequence of interest includes DNA sequences (control sequences) which
are necessary to drive the expression of the gene or genes. The control sequences are
operably linked to the gene. The term "operably linked", as used herein, is defined to
mean that the gene is linked to control sequences in a manner which allows expression
of the gene (or the nucleic acid sequence). Generally, operably linked means
contiguous.

Control sequences include a transcriptional promoter, an optional operator
sequence to control transcription, a sequence encoding suitable mRNA ribosomal
binding sites and sequences which control termination of transcription and translation.
In a particular embodiment, a recombinant gene encoding a desired nucleic acid product
can be placed under the regulatory control of a promoter which can be induced or
repressed, thereby offering a greater degree of control with respect to the level of the
product produced.

As used herein, the term "promoter" refers to a sequence of DNA, usually
upstream (5') of the coding region of a structural gene, which controls the expression of
the coding region by providing recognition and binding sites for RNA polymerase and
other factors which may be required for initiation of transcription. Suitable promoters
are well known in the art. Exemplary promoters include the SV40, CMV and human
elongation factor (EFI) promoters. Other suitable promoters are readily available in the
art (see e.g., Ausubel et al., Current Protocols in Molecular Biology, John Wiley &
Sons Inc., New York (1998); Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2nd edition, Cold Spring Harbor University Press, New York (1989); and U.S.
Patent No. 5,681,735).

A DNA sequence of interest can be isolated from nature, modified from native
sequences or manufactured de novo, as described in, for example, Ausubel et al.,
Current Protocols in Molecular Biology, John Wiley & Sons, New York (1998); and
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Cold Spring
Harbor University Press, New York (1989). DNA sequences can be isolated and fused
together by methods known in the art, such as exploiting and manufacturing compatible
cloning or restriction sites.

The packaging cell lines and viral particles of the present invention can be used,
in vitro, in vivo and ex vivo, to introduce DNA of interest into a eukaryotic cell (e.g., a
mammalian cell) or a mammal (e.g., a human or other mammal or vertebrate). The cells
can be obtained commercially or from a depository or obtained directly from a mammal,
such as by biopsy. The cells can be obtained from a mammal to whom they will be
returned or from another/different mammal of the same or different species. For
example, using the packaging cell lines or viral particles of the present invention, DNA
of interest can be introduced into nonhuman cells, such as pig cells, which are then
introduced into a human. Alternatively, the cell need not be isolated from the mammal
where, for example, it is desirable to deliver viral particles of the present invention to the
mammal in gene therapy.

Ex vivo therapy has been described, for example, in Kasid et al., Proc. Natl.
Acad. Sci. USA, 87:473 (1990); Rosenberg et al., N. Engl. J. Med., 323:570 (1990);
Williams et al., Nature, 310:476 (1984); Dick et al., Cell, 42:71 (1985); Keller et al.,
Nature, 318:149 (1985); and Anderson et al., United States Patent No. 5,399,346.

Methods for administering (introducing) viral particles directly to a mammal are
generally known to those practiced in the art. For example, modes of administration
include parenteral, injection, mucosal, systemic, implant, intraperitoneal, oral,
intradermal, transdermal (e.g., in slow release polymers), intramuscular, intravenous
including infusion and/or bolus injection, subcutaneous, topical, epidural, etc. Viral
particles of the present invention can, preferably, be administered in a pharmaceutically
acceptable carrier, such as saline, sterile water, Ringer's solution, and isotonic sodium
chloride solution.

The dosage of a viral particle of the present invention administered to a 
mammal, including frequency of administration, will vary depending upon a variety of 
factors, including mode and route of administration; size, age, sex, health, body weight 
and diet of the recipient mammal; nature and extent of symptoms of the disease or
disorder being treated; kind of concurrent treatment, frequency of treatment, and the 
effect desired. 

The teachings of all the articles, patents, patent applications and GenBank
sequences cited herein are incorporated by reference in their entirety.

While this invention has been particularly shown and described with references
to preferred embodiments thereof, it will be understood by those skilled in the art that
various changes in form and details may be made therein without departing from the
spirit and scope of the invention as defined by the appended claims.

CLAIMS

What is claimed is: 

1. A packaging cell line for producing a viral accessory protein independent
HIV-derived retroviral vector particle comprising:

a) a mammalian cell;

b) a first retroviral nucleotide sequence in the cell which comprises a
coding sequence for a HIV gagpol, wherein said coding sequence has
been mutagenized to improve expression of the HIV gagpol proteins;

c) a second retroviral nucleotide sequence in the cell which comprises the 
coding sequence for a heterologous envelope protein; and 

d) a third retroviral nucleotide sequence in the cell which comprises a DNA 
sequence of interest and HIV cis-acting sequences required for 
packaging, reverse transcription and integration. 

2. A packaging cell line of Claim 1 wherein the heterologous envelope protein is 
the G glycoprotein of vesicular stomatitis virus (VSV G). 

3. A packaging cell line of Claim 1 wherein the heterologous envelope protein is 
the amphotropic envelope of the Moloney leukemia virus. 

4. A packaging cell line of Claim 1 wherein the DNA sequence of interest encodes 
a heterologous therapeutic protein. 

5. A packaging cell line comprising:

a) a mammalian cell;

b) a first retroviral nucleotide sequence in the cell which comprises a
coding sequence for a HIV gagpol, wherein said coding sequence has
been mutagenized to improve expression of the HIV gagpol proteins; and

c) a second retroviral nucleotide sequence in the cell which comprises a
DNA sequence of interest and HIV cis-acting sequences required for
packaging, reverse transcription and integration.

6. A packaging cell line of Claim 5 wherein the DNA sequence of interest encodes
a heterologous therapeutic protein.

7. A packaging cell line comprising: 

a) a mammalian cell; 

b) a first retroviral nucleotide sequence in the cell which comprises a
coding sequence for a HIV gagpol, wherein said coding sequence has 
been mutagenized to improve expression of the HIV gagpol proteins; and 

c) a second retroviral nucleotide sequence in the cell which comprises the 
coding sequence for a heterologous envelope protein. 

8. A method of producing a packaging cell line for producing a viral accessory
protein independent HIV-derived retroviral vector particle, comprising
co-transfecting mammalian host cells with:

a) a first plasmid comprising a DNA sequence which encodes HIV gagpol
proteins, wherein said DNA sequence has been mutagenized to improve
expression of the HIV gag and pol proteins;

b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and

c) a third plasmid comprising a DNA sequence of interest and HIV
cis-acting sequences required for packaging, reverse transcription and
integration.

9. A method of Claim 8 wherein the heterologous envelope protein is the G
glycoprotein of vesicular stomatitis virus (VSV G).

10. A method of Claim 8 wherein the heterologous envelope protein is the
amphotropic envelope of the Moloney leukemia virus.

11. A method of Claim 8 wherein the DNA sequence of interest is a heterologous
therapeutic protein.

12. A method of producing a viral accessory protein independent HIV-derived
retroviral vector particle comprising co-transfecting mammalian host cells with:

a) a first plasmid comprising a DNA sequence which encodes HIV gagpol
proteins, wherein said DNA sequence has been mutagenized to improve
expression of the HIV gagpol proteins;

b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and

c) a third plasmid comprising a DNA sequence of interest and HIV
cis-acting sequences required for packaging, reverse transcription and
integration.

13. A method of Claim 12 wherein the heterologous envelope protein is the G
glycoprotein of vesicular stomatitis virus (VSV G).



WO 00/15819    PCT/US99/20675

14. A method of Claim 12 wherein the heterologous envelope protein is the 
amphotropic envelope of the Moloney leukemia virus. 

15. A method of Claim 12 wherein the DNA sequence of interest encodes a 
heterologous therapeutic protein. 

16. A packaging cell line for producing a viral accessory protein independent
lentivirus-derived retroviral vector particle comprising:

a) a mammalian cell;

b) a first retroviral nucleotide sequence in the cell which comprises a
coding sequence for a lentivirus gagpol, wherein said coding sequence
has been mutagenized to improve expression of the lentivirus gagpol
proteins;

c) a second retroviral nucleotide sequence in the cell which comprises the
coding sequence for a heterologous envelope protein; and

d) a third retroviral nucleotide sequence in the cell which comprises a DNA
sequence of interest and lentivirus cis-acting sequences required for
packaging, reverse transcription and integration.

17. A packaging cell line of Claim 16 wherein the heterologous envelope protein is
the G glycoprotein of vesicular stomatitis virus (VSV G).

18. A packaging cell line of Claim 16 wherein the heterologous envelope protein is
the amphotropic envelope of the Moloney leukemia virus.

19. A packaging cell line of Claim 16 wherein the DNA sequence of interest
encodes a heterologous therapeutic protein.

20. A packaging cell line comprising:

a) a mammalian cell;

b) a first retroviral nucleotide sequence in the cell which comprises a
coding sequence for lentivirus gagpol, wherein said coding sequence has
been mutagenized to improve expression of the lentivirus gagpol
proteins; and

c) a second retroviral nucleotide sequence in the cell which comprises a
DNA sequence of interest and lentivirus cis-acting sequences required
for packaging, reverse transcription and integration.

21. A packaging cell line of Claim 20 wherein the DNA sequence of interest
encodes a heterologous therapeutic protein.

22. A packaging cell line comprising: 

a) a mammalian cell; 

b) a first retroviral nucleotide sequence in the cell which comprises a
coding sequence for lentivirus gagpol, wherein said coding sequence has
been mutagenized to improve expression of the lentivirus gagpol
proteins; and

c) a second retroviral nucleotide sequence in the cell which comprises the 
coding sequence for a heterologous envelope protein. 

23. A method of producing a packaging cell line for producing a viral accessory
protein independent lentivirus-derived retroviral vector particle, comprising
co-transfecting mammalian host cells with:

a) a first plasmid comprising a DNA sequence which encodes lentivirus
gagpol proteins, wherein said DNA sequence has been mutagenized to
improve expression of the lentivirus gag and pol proteins;

b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and

c) a third plasmid comprising a DNA sequence of interest and lentivirus
cis-acting sequences required for packaging, reverse transcription and
integration.

24. A method of Claim 23 wherein the heterologous envelope protein is the G
glycoprotein of vesicular stomatitis virus (VSV G).

25. A method of Claim 23 wherein the heterologous envelope protein is the
amphotropic envelope of the Moloney leukemia virus.

26. A method of Claim 23 wherein the DNA sequence of interest is a heterologous
therapeutic protein.



27. A method of producing a viral accessory protein independent lentivirus-derived
retroviral vector particle comprising co-transfecting mammalian host cells with:

a) a first plasmid comprising a DNA sequence which encodes lentivirus
gagpol proteins, wherein said DNA sequence has been mutagenized to
improve expression of the lentivirus gagpol proteins;

b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and

c) a third plasmid comprising a DNA sequence of interest and lentivirus
cis-acting sequences required for packaging, reverse transcription and
integration.

28. A method of Claim 27 wherein the heterologous envelope protein is the G
glycoprotein of vesicular stomatitis virus (VSV G).

29. A method of Claim 27 wherein the heterologous envelope protein is the
amphotropic envelope of the Moloney leukemia virus.

30. A method of Claim 27 wherein the DNA sequence of interest encodes a
heterologous therapeutic protein.

31. A viral accessory protein independent HIV-derived retroviral vector particle
produced by the method comprising co-transfecting mammalian host cells with:

a) a first plasmid comprising a DNA sequence which encodes HIV gagpol
proteins, wherein said DNA sequence has been mutagenized to improve
expression of the HIV gagpol proteins;

b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and

c) a third plasmid comprising a DNA sequence of interest and HIV
cis-acting sequences required for packaging, reverse transcription and
integration.

32. A method of Claim 31 wherein the heterologous envelope protein is the G
glycoprotein of vesicular stomatitis virus (VSV G).

33. A method of Claim 31 wherein the heterologous envelope protein is the
amphotropic envelope of the Moloney leukemia virus.

34. A method of Claim 31 wherein the DNA sequence of interest encodes a
heterologous therapeutic protein.

35. A viral accessory protein independent lentivirus-derived retroviral vector
particle produced by the method comprising co-transfecting mammalian host
cells with:

a) a first plasmid comprising a DNA sequence which encodes lentivirus
gagpol proteins, wherein said DNA sequence has been mutagenized to
improve expression of the lentivirus gagpol proteins;

b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and

c) a third plasmid comprising a DNA sequence of interest and lentivirus
cis-acting sequences required for packaging, reverse transcription and
integration.

36. A method of Claim 35 wherein the heterologous envelope protein is the G
glycoprotein of vesicular stomatitis virus (VSV G).

37. A method of Claim 35 wherein the heterologous envelope protein is the
amphotropic envelope of the Moloney leukemia virus.

38. A method of Claim 35 wherein the DNA sequence of interest encodes a 
heterologous therapeutic protein. 

39. Isolated DNA encoding a codon optimized HIV gagpol.

40. Isolated DNA encoding a codon optimized HIV gag.

41. Isolated DNA of Claim 40 comprising the nucleotide sequence of SEQ ID NO:4.

42. Isolated DNA encoding a codon optimized HIV pol.

43. Isolated DNA of Claim 42 comprising the nucleotide sequence of SEQ ID
NO:10.

44. A method of introducing a DNA sequence of interest into a mammal comprising
introducing into said mammal a viral accessory protein independent HIV-derived
retroviral vector particle comprising the DNA sequence of interest.

45. The method of Claim 44 wherein the mammal is a human. 

46. The method of Claim 44 wherein the DNA sequence of interest encodes a 
heterologous therapeutic protein. 

47. A method of introducing a DNA sequence of interest into a mammal comprising
the steps of:

a) introducing into cells a viral accessory protein independent HIV-derived 
retroviral vector particle comprising the DNA sequence of interest; and 

b) returning the cells obtained in step a) to the mammal. 

48. The method of Claim 47 wherein the mammal is a human. 

49. The method of Claim 47 wherein the DNA sequence of interest is a heterologous
therapeutic protein. 



[Drawing sheet 1/29: restriction map. Site labels recoverable from this copy: BspHI,
BamHI, BspMI, NsiI, PpuMI, HgaI and EcoRI.]



[Drawing sheets 2/29 and 3/29: codon usage tables. Labels recoverable from this copy:
"Amino Acid", "pNL4-3 gagpol" and "mam". The individual codon entries and numeric
values are not recoverable.]
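The codon usage tables on the drawing sheets compare, for each amino acid, how often each synonymous codon appears. A minimal Python sketch of how such a per-amino-acid table could be computed from an in-frame coding sequence is given below. The function name `codon_usage` and the output layout are assumptions made for illustration; the input fragment is the wild-type row printed at position 1602 of the gag alignment, and nothing here reproduces the values of the original tables.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_usage(dna):
    """Return {amino_acid: {codon: fraction}} for an in-frame coding sequence.

    Hypothetical helper: fractions are relative to all codons that encode
    the same amino acid within the supplied sequence.
    """
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    counts = Counter(dna[i:i + 3] for i in range(0, len(dna), 3))
    totals = Counter()
    for codon, n in counts.items():
        totals[CODON_TO_AA[codon]] += n
    usage = defaultdict(dict)
    for codon, n in counts.items():
        aa = CODON_TO_AA[codon]
        usage[aa][codon] = n / totals[aa]
    return dict(usage)


# Wild-type gag fragment (NKIVRMYSPTSILDI) taken from the alignment figure.
fragment = "AATAAAATAGTAAGAATGTATAGCCCTACCAGCATTCTGGACATA"
usage = codon_usage(fragment)
```

Running the same function on a wild-type and a recoded sequence gives two tables that can be compared amino acid by amino acid, which is the kind of comparison the drawing sheets appear to present.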



Rev 

• Regulates HIV gene expression by promoting 
cytoplasmic levels of unspliced and singly spliced 
mRNAs 

• Postulated to affect splicing, stability, transport, 
and translation 



Fig. 4 

Codon Optimization of HIV gagpol

• Remove A-rich instability elements

• Improve translational efficiency

• Reduce risk of recombination with transfer vector



Fig. 5 
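The goals listed in Fig. 5 can be illustrated with a small recoding sketch in Python. It replaces every codon with a single preferred synonymous codon and checks that the encoded protein is unchanged. The `PREFERRED` table is an assumption chosen to agree with codons visible in the pHDMHgpm2 rows of the alignment figures; it is not stated in the text, and the mutagenesis procedure actually used for the construct is described elsewhere in the specification. The sketch also recodes every codon, whereas the alignment shows the 3' end of gag left unchanged.

```python
from itertools import product

# Standard genetic code, built in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumed preferred codon per amino acid (illustrative only, see lead-in).
PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "TCC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TAA",
}


def translate(dna):
    """Translate an in-frame DNA sequence to one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_optimize(dna):
    """Recode each codon with the assumed preferred synonymous codon."""
    return "".join(PREFERRED[aa] for aa in translate(dna))


def a_fraction(dna):
    """Fraction of adenine bases, a rough proxy for A-richness."""
    return dna.upper().count("A") / len(dna)


# Wild-type gag fragment (IPVGEIYKRWIILGL) taken from the alignment figure.
wild_type = "ATCCCAGTAGGAGAAATCTATAAAAGATGGATAATCCTGGGATTA"
recoded = codon_optimize(wild_type)
```

For this fragment the recoded sequence matches the corresponding pHDMHgpm2 row of the alignment, the translation is unchanged, and the adenine fraction drops, which is consistent with the first bullet of Fig. 5.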

Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table.

[Aligned sequences: NL4-3 genbank.SEQ and pHDMHgpm2.seq. The sequence rows of this
sheet are not recoverable from this copy.]

Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table.

[Aligned sequences: NL4-3 genbank.SEQ and pHDMHgpm2.seq. This sheet continues the
alignment through blocks that include NL4-3 positions 1197, 1242 and 1332; the
sequence rows are too garbled in this copy to reproduce reliably.]

Fig. 8B

Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table.

1512 STLQEQIGWMTHNPP NL4-3 genbank.SEQ
1512 AGT ACC CTT CAG GAA CAA ATA GGA TGG ATG ACA CAT AAT CCA CCT
2039 STLQEQIGWMTHNPP pHDMHgpm2.seq
2039 TCC ACC CTG CAA GAG CAG ATC GGC TGG ATG ACC CAC AAC CCC CCC

1557 IPVGEIYKRWIILGL NL4-3 genbank.SEQ
1557 ATC CCA GTA GGA GAA ATC TAT AAA AGA TGG ATA ATC CTG GGA TTA
2084 IPVGEIYKRWIILGL pHDMHgpm2.seq
2084 ATC CCC GTG GGC GAG ATC TAC AAG CGC TGG ATC ATC CTG GGC CTG

1602 NKIVRMYSPTSILDI NL4-3 genbank.SEQ
1602 AAT AAA ATA GTA AGA ATG TAT AGC CCT ACC AGC ATT CTG GAC ATA
2129 NKIVRMYSPTSILDI pHDMHgpm2.seq
2129 AAC AAG ATC GTG CGC ATG TAC TCC CCC ACC TCC ATC CTG GAC ATC

1647 RQGPKEPFRDYVDRF NL4-3 genbank.SEQ
1647 AGA CAA GGA CCA AAG GAA CCC TTT AGA GAC TAT GTA GAC CGA TTC
2174 RQGPKEPFRDYVDRF pHDMHgpm2.seq
2174 CGC CAG GGC CCC AAG GAG CCC TTC CGC GAC TAC GTG GAC CGC TTC

1692 YKTLRAEQASQEVKN NL4-3 genbank.SEQ
1692 TAT AAA ACT CTA AGA GCC GAG CAA GCT TCA CAA GAG GTA AAA AAT
2219 YKTLRAEQASQEVKN pHDMHgpm2.seq
2219 TAC AAG ACC CTG CGC GCC GAG CAG GCC TCC CAG GAG GTA AAG AAC

1737 WMTETLLVQNANPDC NL4-3 genbank.SEQ
1737 TGG ATG ACA GAA ACC TTG TTG GTC CAA AAT GCG AAC CCA GAT TGT
2264 WMTETLLVQNANPDC pHDMHgpm2.seq
2264 TGG ATG ACC GAG ACC CTG CTG GTG CAG AAC GCC AAC CCC GAC TGC

1782 KTILKALGPGATLEE NL4-3 genbank.SEQ
1782 AAG ACT ATT TTA AAA GCA TTG GGA CCA GGA GCG ACA CTA GAA GAA
2309 KTILKALGPGATLEE pHDMHgpm2.seq
2309 AAG ACC ATC CTG AAG GCC CTG GGC CCC GGC GCC ACC CTG GAG GAG

1827 MMTACQGVGGPGHKA NL4-3 genbank.SEQ
1827 ATG ATG ACA GCA TGT CAG GGA GTG GGG GGA CCC GGC CAT AAA GCA
2354 MMTACQGVGGPGHKA pHDMHgpm2.seq
2354 ATG ATG ACC GCC TGC CAG GGC GTG GGC GGC CCC GGC CAC AAG GCC

Fig. 8C

Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table.

1872 RVLAEAMSQVTNPAT NL4-3 genbank.SEQ
1872 AGA GTT TTG GCT GAA GCA ATG AGC CAA GTA ACA AAT CCA GCT ACC
2399 RVLAEAMSQVTNPAT pHDMHgpm2.seq
2399 CGC GTG CTG GCC GAG GCC ATG TCC CAA GTC ACC AAC CCC GCC ACC

1917 IMIQKGNFRNQRKTV NL4-3 genbank.SEQ
1917 ATA ATG ATA CAG AAA GGC AAT TTT AGG AAC CAA AGA AAG ACT GTT
2444 IMIQKGNFRNQRKTV pHDMHgpm2.seq
2444 ATC ATG ATC CAG AAG GGC AAC TTC CGC AAC CAG CGC AAG ACC GTG

1962 KCFNCGKEGHIAKNC NL4-3 genbank.SEQ
1962 AAG TGT TTC AAT TGT GGC AAA GAA GGG CAC ATA GCC AAA AAT TGC
2489 KCFNCGKEGHIAKNC pHDMHgpm2.seq
2489 AAG TGC TTC AAC TGC GGC AAG GAG GGC CAC ATC GCC AAG AAC TGC

2007 RAPRKKGCWKCGKEG NL4-3 genbank.SEQ
2007 AGG GCC CCT AGG AAA AAG GGC TGT TGG AAA TGT GGA AAG GAA GGA
2534 RAPRKKGCWKCGKEG pHDMHgpm2.seq
2534 CGC GCC CCC CGC AAG AAG GGC TGC TGG AAG TGC GGC AAG GAG GGC

2052 HQMKDCTERQANFLG NL4-3 genbank.SEQ
2052 CAC CAA ATG AAA GAT TGT ACT GAG AGA CAG GCT AAT TTT TTA GGG
2579 HQMKDCTERQANFLG pHDMHgpm2.seq
2579 CAC CAG ATG AAA GAT TGT ACT GAG AGA CAG GCT AAT TTT TTA GGG

2097 KIWPSHKGRPGNFLQ NL4-3 genbank.SEQ
2097 AAG ATC TGG CCT TCC CAC AAG GGA AGG CCA GGG AAT TTT CTT CAG
2624 KIWPSHKGRPGNFLQ pHDMHgpm2.seq
2624 AAG ATC TGG CCT TCC CAC AAG GGA AGG CCA GGG AAT TTT CTT CAG

2142 SRPEPTAPPEESFRF NL4-3 genbank.SEQ
2142 AGC AGA CCA GAG CCA ACA GCC CCA CCA GAA GAG AGC TTC AGG TTT
2669 SRPEPTAPPEESFRF pHDMHgpm2.seq
2669 AGC AGA CCA GAG CCA ACA GCC CCA CCA GAA GAG AGC TTC AGG TTT

[2187: NL4-3 genbank.SEQ rows not legible in this copy.]
2714 GEETTTPSQKQEPID pHDMHgpm2.seq
2714 GGG GAA GAG ACA ACA ACT CCC TCT CAG AAG CAG GAG CCG ATA GAC

Fig. 8D
Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table.

2232 KELYPLASLRSLFGS NL4-3 genbank.SEQ
2232 AAG GAA CTG TAT CCT TTA GCT TCC CTC AGA TCA CTC TTT GGC AGC
2759 KELYPLASLRSLFGS pHDMHgpm2.seq
2759 AAG GAA CTG TAT CCT TTA GCT TCC CTC AGA TCA CTC TTT GGC AGC

2277 DPSSQ* NL4-3 genbank.SEQ
2277 GAC CCC TCG TCA CAA TAA
2804 DPSSQ* pHDMHgpm2.seq
2804 GAC CCC TCG TCA CAA TAA



Fig. 8E

Alignment Report of Codon Optimization (pol).MEG, using Clustal method with PAM250 residue weight table.

2090 2120

2087 FFREDLAFPQGKARE NL4-3 genbank.SEQ
2087 TTT TTT AGG GAA GAT CTG GCC TTC CCA CAA GGG AAG GCC AGG GAA
2085 FFREDLAFPQGKARE pNL4-3.seq
2085 TTT TTT AGG GAA GAT CTG GCC TTC CCA CAA GGG AAG GCC AGG GAA
2612 FFREDLAFPQGKARE pHDMHgpm2.seq
2612 TTT TTT AGG GAA GAT CTG GCC TTC CCA CAA GGG AAG GCC AGG GAA



2150

2132 FSSEQTRANSPTRRE NL4-3 genbank.SEQ
2132 TTT TCT TCA GAG CAG ACC AGA GCC AAC AGC CCC ACC AGA AGA GAG
2130 FSSEQTRANSPTRRE pNL4-3.seq
2130 TTT TCT TCA GAG CAG ACC AGA GCC AAC AGC CCC ACC AGA AGA GAG
2657 FSSEQTRANSPTRRE pHDMHgpm2.seq
2657 TTT TCT TCA GAG CAG ACC AGA GCC AAC AGC CCC ACC AGA AGA GAG



2180 2210

2177 LQVWGRDNNSLSEAG NL4-3 genbank.SEQ
2177 CTT CAG GTT TGG GGA AGA GAC AAC AAC TCC CTC TCA GAA GCA GGA
2175 LQVWGRDNNSLSEAG pNL4-3.seq
2175 CTT CAG GTT TGG GGA AGA GAC AAC AAC TCC CTC TCA GAA GCA GGA
2702 LQVWGRDNNSLSEAG pHDMHgpm2.seq
2702 CTT CAG GTT TGG GGA AGA GAC AAC AAC TCC CTC TCA GAA GCA GGA















2240

2222 ADRQGTVSFSFPQIT NL4-3 genbank.SEQ
2222 GCC GAT AGA CAA GGA ACT GTA TCC TTT AGC TTC CCT CAG ATC ACT
2220 ADRQGTVSFSFPQIT pNL4-3.seq
2220 GCC GAT AGA CAA GGA ACT GTA TCC TTT AGC TTC CCT CAG ATC ACT
2747 ADRQGTVSFSFPQIT pHDMHgpm2.seq
2747 GCC GAT AGA CAA GGA ACT GTA TCC TTT AGC TTC CCT CAG ATC ACT





2270 2300

2267 LWQRPLVTIKIGGQL NL4-3 genbank.SEQ
2267 CTT TGG CAG CGA CCC CTC GTC ACA ATA AAG ATA GGG GGG CAA TTA
2265 LWQRPLVTIKIGGQL pNL4-3.seq
2265 CTT TGG CAG CGA CCC CTC GTC ACA ATA AAG ATA GGG GGG CAA TTA
2792 LWQRPLVTIKIGGQL pHDMHgpm2.seq
2792 CTT TGG CAG CGA CCC CTC GTC ACA ATA AAG ATC GGT GGC CAG CTG



2330

2312 ...VLE NL4-3 genbank.SEQ
2312 ...
2310 ... pNL4-3.seq
2310 ...
2837 KEALLDTGADDTVLE pHDMHgpm2.seq
2837 AAG GAG GCC CTG CTG GAC ACC GGC GCC GAC GAC ACC GTG CTG GAG





2360 2390

2357 EMNLPG... NL4-3 genbank.SEQ
2357 GAA ATG AAT TTG CCA GGA ...
2355 EMNLPG... pNL4-3.seq
2355 GAA ATG AAT TTG CCA GGA ...
2882 EMNLPGRWKPKMIGG pHDMHgpm2.seq
2882 GAG ATG AAC CTG CCC GGC CGC TGG AAG CCC AAG ATG ATC GGC GGC



2420

2402 IGGFIKVGQYD... NL4-3 genbank.SEQ
2402 ATT GGA GGT TTT ATC AAA GTA GGA CAG TAT GAT ...
2400 IGGFIKVRQYD... pNL4-3.seq
2400 ATT GGA GGT TTT ATC AAA GTA AGA CAG TAT GAT ...
2927 IGGFIKVRQYDQILI pHDMHgpm2.seq
2927 ATC GGC GGC TTC ATC AAA GTC CGC CAG TAC GAC CAG ATC CTG ATC



2450 2480

2447 EICGHKAIGTVLVGP NL4-3 genbank.SEQ
2447 GAA ATC TGC GGA CAT AAA GCT ATA GGT ACA GTA TTA GTA GGA CCT
2445 EICGHKAIGTVLVGP pNL4-3.seq
2445 GAA ATC TGC GGA CAT AAA GCT ATA GGT ACA GTA TTA GTA GGA CCT
2972 EICGHKAIGTVLVGP pHDMHgpm2.seq
2972 GAG ATC TGC GGC CAC AAG GCC ATC GGC ACC GTG CTG GTG GGC CCC



2510

2492 TPVNIIGRNLLTQIG NL4-3 genbank.SEQ
2492 ACA CCT GTC AAC ATA ATT GGA AGA AAT CTG TTG ACT CAG ATT GGC
2490 TPVNIIGRNLLTQIG pNL4-3.seq
2490 ACA CCT GTC AAC ATA ATT GGA AGA AAT CTG TTG ACT CAG ATT GGC
3017 TPVNIIGRNLLTQIG pHDMHgpm2.seq
3017 ACC CCC GTG AAC ATC ATC GGC CGC AAC CTG CTG ACC CAG ATC GGC



2540 2570

2537 CTLNFPISPIETVPV NL4-3 genbank.SEQ
2537 TGC ACT TTA AAT TTT CCC ATT AGT CCT ATT GAG ACT GTA CCA GTA
2535 CTLNFPISPIETVPV pNL4-3.seq
2535 TGC ACT TTA AAT TTT CCC ATT AGT CCT ATT GAG ACT GTA CCA GTA
3062 CTLNFPISPIETVPV pHDMHgpm2.seq
3062 TGC ACC CTG AAC TTC CCC ATC TCC CCC ATC GAG ACC GTG CCC GTG



2600

2582 K...GMDGPKVKQW... NL4-3 genbank.SEQ
2582 ... GGA ATG GAT GGC CCA AAA GTT AAA CAA TGG ...
2580 K...GMDGPKVKQW... pNL4-3.seq
2580 ... GGA ATG GAT GGC CCA AAA GTT AAA CAA TGG ...
3107 KLKPGMDGPKVKQWP pHDMHgpm2.seq
3107 AAG CTG AAG CCC GGC ATG GAC GGC CCC AAA GTC AAG CAG TGG CCC

2630 2660 

2627 LTEEKIKALVEICTE NL4-3 genbank.SEQ

2627 TTG ACA GAA GAA AAA ATA AAA GCA TTA GTA GAA ATT TGT ACA GAA

2625 LTEEKIKALVEICTE pNL4-3.seq

2625 TTG ACA GAA GAA AAA ATA AAA GCA TTA GTA GAA ATT TGT ACA GAA 

3152 LTEEKIKALVEICTE pHDMHgpm2 . seq 

3152 CTG ACC GAG GAG AAG ATC AAG GCC CTG GTG GAG ATC TGC ACC GAG 
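
One way to read an alignment block such as the one above is to check that the codon-optimized pHDMHgpm2 row still encodes the same residues as the NL4-3 row. The following is a minimal Python sketch of that check, assuming the standard genetic code. The names `CODON_TABLE`, `translate` and `is_synonymous` are illustrative and are not taken from the alignment software used for the figure.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(codon_row):
    """Translate a whitespace-separated row of codons into one-letter residues."""
    return "".join(CODON_TABLE[codon] for codon in codon_row.split())


def is_synonymous(wild_type_row, optimized_row):
    """Return True when both codon rows encode the same residue string."""
    return translate(wild_type_row) == translate(optimized_row)


# Codon rows copied from the block at positions 2627 (NL4-3) and 3152 (pHDMHgpm2).
NL4_3_ROW = "TTG ACA GAA GAA AAA ATA AAA GCA TTA GTA GAA ATT TGT ACA GAA"
PHDMHGPM2_ROW = "CTG ACC GAG GAG AAG ATC AAG GCC CTG GTG GAG ATC TGC ACC GAG"

if __name__ == "__main__":
    print(translate(NL4_3_ROW))
    print(is_synonymous(NL4_3_ROW, PHDMHGPM2_ROW))
```

The same helper can be pointed at any other pair of rows in these figures; a mismatch would indicate either a genuine residue change or a scanning error in one of the rows.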



2690

2672 MEKEGKISKIGPENP NL4-3 genbank.SEQ
2672 ATG GAA AAG GAA GGA AAA ATT TCA AAA ATT GGG CCT GAA AAT CCA
2670 MEKEGKISKIGPENP pNL4-3.seq
2670 ATG GAA AAG GAA GGA AAA ATT TCA AAA ATT GGG CCT GAA AAT CCA
3197 MEKEGKISKIGPENP pHDMHgpm2.seq
3197 ATG GAG AAG GAG GGC AAG ATC TCC AAG ATC GGC CCC GAG AAC CCC



2720 2750

2717 YNTPVFAIKKKDSTK NL4-3 genbank.SEQ
2717 TAC AAT ACT CCA GTA TTT GCC ATA AAG AAA AAA GAC AGT ACT AAA
2715 YNTPVFAIKKKDSTK pNL4-3.seq
2715 TAC AAT ACT CCA GTA TTT GCC ATA AAG AAA AAA GAC AGT ACT AAA
3242 YNTPVFAIKKKDSTK pHDMHgpm2.seq
3242 TAC AAC ACC CCC GTG TTC GCC ATC AAG AAG AAG GAC TCC ACC AAG



2780

2762 WRKLVDFRELNKRTQ NL4-3 genbank.SEQ
2762 TGG AGA AAA TTA GTA GAT TTC AGA GAA CTT AAT AAG AGA ACT CAA
2760 WRKLVDFRELNKRTQ pNL4-3.seq
2760 TGG AGA AAA TTA GTA GAT TTC AGA GAA CTT AAT AAG AGA ACT CAA
3287 WRKLVDFRELNKRTQ pHDMHgpm2.seq
3287 TGG CGC AAG CTG GTG GAC TTC CGC GAG CTG AAC AAG CGC ACC CAG

2810 2840

2807 DFWEVQLGIPHPAGL NL4-3 genbank.SEQ
2807 GAT TTC TGG GAA GTT CAA TTA GGA ATA CCA CAT CCT GCA GGG TTA
2805 DFWEVQLGIPHPAGL pNL4-3.seq
2805 GAT TTC TGG GAA GTT CAA TTA GGA ATA CCA CAT CCT GCA GGG TTA
3332 DFWEVQLGIPHPAGL pHDMHgpm2.seq
3332 GAC TTC TGG GAG GTG CAG CTG GGC ATC CCC CAC CCC GCC GGC CTG

2870

2852 KQKKSVTVLDVGDAY NL4-3 genbank.SEQ
2852 AAA CAG AAA AAA TCA GTA ACA GTA CTG GAT GTG GGC GAT GCA TAT
2850 KQKKSVTVLDVGDAY pNL4-3.seq
2850 AAA CAG AAA AAA TCA GTA ACA GTA CTG GAT GTG GGC GAT GCA TAT
3377 KQKKSVTVLDVGDAY pHDMHgpm2.seq
3377 AAG CAG AAG AAG TCC GTG ACC GTG CTG GAC GTG GGC GAC GCC TAC



2900 2930

2897 FSVPLDKDFRKYTAF NL4-3 genbank.SEQ
2897 TTT TCA GTT CCC TTA GAT AAA GAC TTC AGG AAG TAT ACT GCA TTT
2895 FSVPLDKDFRKYTAF pNL4-3.seq
2895 TTT TCA GTT CCC TTA GAT AAA GAC ... ... AAG TAT ACT GCA TTT
3422 FSVPLDKDFRKYTAF pHDMHgpm2.seq
3422 TTC TCC GTG CCC CTG GAC AAG GAC TTC CGC AAG TAC ACC GCC TTC














2960

2942 TIPSINNETPGIRYQ NL4-3 genbank.SEQ
2942 ACC ATA CCT AGT ATA AAC AAT GAG ACA CCA GGG ATT AGA TAT CAG
2940 TIPSINNETPGIRYQ pNL4-3.seq
2940 ACC ATA CCT AGT ATA AAC AAT ... ... CCA GGG ATT AGA TAT CAG
3467 TIPSINNETPGIRYQ pHDMHgpm2.seq
3467 ACC ATC CCC TCC ATC AAC AAC GAG ACC CCC GGC ATC CGC TAC CAG




2990 3020

2987 YNVLPQGWKGSPAIF NL4-3 genbank.SEQ
2987 TAC AAT GTG CTT CCA CAG GGA TGG AAA GGA TCA CCA GCA ATA TTC
2985 YNVLPQGWKGSPAIF pNL4-3.seq
2985 TAC AAT GTG CTT CCA CAG GGA ... AAA ... TCA CCA GCA ATA TTC
3512 YNVLPQGWKGSPAIF pHDMHgpm2.seq
3512 TAC AAC GTG CTG CCC CAG GGC TGG AAG GGC TCC CCC GCC ATC TTC














3050

3032 QCSMTKILEPFRKQN NL4-3 genbank.SEQ
3032 CAG TGT AGC ATG ACA AAA ATC TTA GAG CCT TTT AGA AAA CAA AAT
3030 QCSMTKILEPFRKQN pNL4-3.seq
3030 CAG TGT AGC ATG ACA AAA ATC TTA GAG CCT TTT AGA AAA CAA AAT
3557 QCSMTKILEPFRKQN pHDMHgpm2.seq
3557 CAG TGC TCC ATG ACC AAG ATC CTG GAG CCC TTC CGC AAG CAG AAC




3080 3110

3077 PDIVIYQYMDDLYVG NL4-3 genbank.SEQ
3077 CCA GAC ATA GTC ATC TAT CAA TAC ATG GAT GAT TTG TAT GTA GGA
3075 PDIVIYQYMDDLYVG pNL4-3.seq
3075 CCA GAC ATA GTC ATC TAT CAA TAC ATG GAT GAT TTG TAT GTA GGA
3602 PDIVIYQYMDDLYVG pHDMHgpm2.seq
3602 CCC GAC ATC GTG ATC TAC CAG TAC ATG GAC GAC CTG TAC GTG GGC



3140

3122 SDLEIGQHRTKIEEL NL4-3 genbank.SEQ
3122 TCT GAC TTA GAA ATA GGG CAG CAT AGA ACA AAA ATA GAG GAA CTG
3120 SDLEIGQHRTKIEEL pNL4-3.seq
3120 TCT GAC TTA GAA ATA GGG CAG CAT AGA ACA AAA ATA GAG GAA CTG
3647 SDLEIGQHRTKIEEL pHDMHgpm2.seq
3647 TCC GAC CTG GAG ATC GGC CAG CAC CGC ACC AAG ATC GAG GAG CTG



3170 3200

3167 RQHLLRWGFTTPDKK NL4-3 genbank.SEQ
3167 AGA CAA CAT CTG TTG AGG TGG GGA TTT ACC ACA CCA GAC AAA AAA
3165 RQHLLRWGFTTPDKK pNL4-3.seq
3165 AGA CAA CAT CTG TTG AGG TGG GGA TTT ACC ACA CCA GAC AAA AAA
3692 RQHLLRWGFTTPDKK pHDMHgpm2.seq
3692 CGC CAG CAC CTG CTG CGC TGG GGC TTC ACC ACC CCC GAC AAG AAG

3230

3212 HQKEPPFLWMGYELH NL4-3 genbank.SEQ
3212 CAT CAG AAA GAA CCT CCA TTC CTT TGG ATG GGT TAT GAA CTC CAT
3210 HQKEPPFLWMGYELH pNL4-3.seq
3210 CAT CAG AAA GAA CCT CCA TTC CTT TGG ATG GGT TAT GAA CTC CAT
3737 HQKEPPFLWMGYELH pHDMHgpm2.seq
3737 CAC CAG AAG GAG CCC CCC TTC CTG TGG ATG GGC TAC GAG CTG CAC

3260 3290

3257 PDKWTVQPIVLPEKD NL4-3 genbank.SEQ
3257 CCT GAT AAA TGG ACA GTA CAG CCT ATA GTG CTG CCA GAA AAG GAC
3255 PDKWTVQPIVLPEKD pNL4-3.seq
3255 CCT GAT AAA TGG ACA GTA CAG CCT ATA GTG CTG CCA GAA AAG GAC
3782 PDKWTVQPIVLPEKD pHDMHgpm2.seq
3782 CCC GAC AAG TGG ACC GTG CAG CCC ATC GTG CTG CCC GAG AAG GAC

3320

3302 SWTVNDIQKLVGKLN NL4-3 genbank.SEQ
3302 AGC TGG ACT GTC AAT GAC ATA CAG AAA TTA GTG GGA AAA TTG AAT
3300 SWTVNDIQKLVGKLN pNL4-3.seq
3300 AGC TGG ACT GTC AAT GAC ATA CAG AAA TTA GTG GGA AAA TTG AAT
3827 SWTVNDIQKLVGKLN pHDMHgpm2.seq
3827 TCC TGG ACC GTG AAC GAC ATC CAG AAG CTG GTG GGC AAG CTG AAC

3350 3380

3347 WASQIYAGIKVRQLC NL4-3 genbank.SEQ
3347 TGG GCA AGT CAG ATT TAT GCA GGG ATT AAA GTA AGG CAA TTA TGT
3345 WASQIYAGIKVRQLC pNL4-3.seq
3345 TGG GCA AGT CAG ATT TAT GCA GGG ATT AAA GTA AGG CAA TTA TGT
3872 WASQIYAGIKVRQLC pHDMHgpm2.seq
3872 TGG GCC TCC CAG ATC TAC GCC GGC ATC AAA GTC CGC CAG CTG TGC

3410

3392 KLLRGTKALTEVVPL NL4-3 genbank.SEQ
3392 AAA CTT CTT AGG GGA ACC AAA GCA CTA ACA GAA GTA GTA CCA CTA
3390 KLLRGTKALTEVVPL pNL4-3.seq
3390 AAA CTT CTT AGG GGA ACC AAA GCA CTA ACA GAA GTA GTA CCA CTA
3917 KLLRGTKALTEVVPL pHDMHgpm2.seq
3917 AAG CTG CTG CGC GGC ACC AAG GCC CTG ACC GAG GTG GTG CCC CTG



Fig. 9E 






' I 

3440 3470 

1 — — . I 

3437 TEEAELELAENREIL NL4-3 genbank.SEQ 

3437 ACA GAA GAA GCA GAG CTA GAA CTG GCA GAA AAC AGG GAG ATT CTA 

3435 TEEAELELAENREIL pNL4-3.seq

3435 ACA GAA GAA GCA GAG CTA GAA CTG GCA GAA AAC AGG GAG ATT CTA 

3962 TEEAELELAENREIL pHDMHgpm2.seq

3962 ACC GAG GAG GCC GAG CTG GAG CTG GCC GAG AAC CGC GAG ATC CTG 
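
To put a number on how much a block like the one above was rewritten, the wild-type and optimized codon rows can be compared position by position. The sketch below is only an illustration: the helper names `changed_codons` and `gc_count` are hypothetical, and the G+C tally is an added summary statistic, not something reported in the figure.

```python
def changed_codons(wild_type_row, optimized_row):
    """Count aligned codon positions whose codons differ between the two rows."""
    pairs = zip(wild_type_row.split(), optimized_row.split())
    return sum(1 for wild_type, optimized in pairs if wild_type != optimized)


def gc_count(codon_row):
    """Count G and C bases in a whitespace-separated codon row."""
    return sum(codon_row.count(base) for base in "GC")


# Codon rows copied from the block at positions 3437 (NL4-3) and 3962 (pHDMHgpm2).
NL4_3_ROW = "ACA GAA GAA GCA GAG CTA GAA CTG GCA GAA AAC AGG GAG ATT CTA"
PHDMHGPM2_ROW = "ACC GAG GAG GCC GAG CTG GAG CTG GCC GAG AAC CGC GAG ATC CTG"

if __name__ == "__main__":
    total = len(NL4_3_ROW.split())
    print(f"{changed_codons(NL4_3_ROW, PHDMHGPM2_ROW)} of {total} codons changed")
    print(f"G+C bases: {gc_count(NL4_3_ROW)} -> {gc_count(PHDMHGPM2_ROW)}")
```

For this block the comparison gives 11 changed codons out of 15, with the G+C tally rising from 20 to 31 of 45 bases.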

3500 

3482 KEPVHGVYYDPSKDL NL4-3 genbank.SEQ

3482 AAA GAA CCG GTA CAT GGA GTG TAT TAT GAC CCA TCA AAA GAC TTA

3480 KEPVHGVYYDPSKDL pNL4-3.seq

3480 AAA GAA CCG GTA CAT GGA GTG TAT TAT GAC CCA TCA AAA GAC TTA

4007 KEPVHGVYYDPSKDL pHDMHgpm2.seq

4007 AAG GAG CCC GTG CAC GGC GTG TAC TAC GAC CCC TCC AAG GAC CTG 

3530 3560

3527 IAEIQKQGQGQWTYQ NL4-3 genbank.SEQ
3527 ATA GCA GAA ATA CAG AAG CAG GGG CAA GGC CAA TGG ACA TAT CAA
3525 IAEIQKQGQGQWTYQ pNL4-3.seq
3525 ATA GCA GAA ATA CAG AAG CAG GGG CAA GGC CAA TGG ACA TAT CAA
4052 IAEIQKQGQGQWTYQ pHDMHgpm2.seq
4052 ATC GCC GAG ATC CAG AAG CAG GGC CAG GGC CAG TGG ACC TAC CAG














3590

3572 IYQEPFKNLKTGKYA NL4-3 genbank.SEQ
3572 ATT TAT CAA GAG CCA TTT AAA AAT CTG AAA ACA GGA AAA TAT GCA
3570 IYQEPFKNLKTGKYA pNL4-3.seq
3570 ATT TAT CAA GAG CCA TTT AAA AAT CTG AAA ACA GGA AAA TAT GCA
4097 IYQEPFKNLKTGKYA pHDMHgpm2.seq
4097 ATC TAC CAG GAG CCC TTC AAG AAC CTG AAG ACC GGC AAA TAC GCC






3620 3650

3617 RMKGAHTNDVKQLTE NL4-3 genbank.SEQ
3617 AGA ATG AAG GGT GCC CAC ACT AAT GAT GTG AAA CAA TTA ACA GAG
3615 RMKGAHTNDVKQLTE pNL4-3.seq
3615 AGA ATG AAG GGT GCC CAC ACT AAT GAT GTG AAA CAA TTA ACA GAG
4142 RMKGAHTNDVKQLTE pHDMHgpm2.seq
4142 CGC ATG AAG GGC GCC CAC ACC AAC GAC GTG AAG CAG CTG ACC GAG
















3680

3662 AVQKIATESIVIWGK NL4-3 genbank.SEQ
3662 GCA GTA CAA AAA ATA GCC ACA GAA AGC ATA GTA ATA TGG GGA AAG
3660 AVQKIATESIVIWGK pNL4-3.seq
3660 GCA GTA CAA AAA ATA GCC ACA GAA AGC ATA GTA ATA TGG GGA AAG
4187 AVQKIATESIVIWGK pHDMHgpm2.seq
4187 GCC GTG CAG AAG ATC GCC ACC GAG TCC ATC GTG ATC TGG GGC AAG





3710 3740

3707 TPKFKLPIQKETWE... NL4-3 genbank.SEQ



3980 4010

3977 KTELQAIHLALQDSG NL4-3 genbank.SEQ
3977 AAG ACT GAG TTA CAA GCA ATT CAT CTA GCT TTG CAG GAT TCG GGA
3975 KTELQAIHLALQDSG pNL4-3.seq
3975 AAG ACT GAG TTA CAA GCA ATT CAT CTA GCT TTG CAG GAT TCG GGA
4502 KTELQAIHLALQDSG pHDMHgpm2.seq
4502 AAG ACC GAG CTG CAG GCC ATC CAC CTG GCC CTG CAA GAC TCC GGC



4040

4022 LEVNIVTDSQYALGI NL4-3 genbank.SEQ
4022 TTA GAA GTA ... ATA GTG ACA GAC TCA CAA TAT GCA TTG GGA ATC
4020 LEVNIVTDSQYALGI pNL4-3.seq
4020 TTA GAA GTA AAC ATA GTG ACA GAC TCA CAA TAT GCA TTG GGA ATC
4547 LEVNIVTDSQYALGI pHDMHgpm2.seq
4547 CTG GAG GTG AAC ATC GTG ACC GAC TCC CAG TAT GCA TTG GGC ATC




4070 4100

4067 IQAQPDKSESELVSQ NL4-3 genbank.SEQ
4067 ATT CAA GCA CAA CCA GAT AAG AGT GAA TCA GAG TTA GTC AGT CAA
4065 IQAQPDKSESELVSQ pNL4-3.seq
4065 ATT CAA GCA CAA CCA GAT AAG AGT GAA TCA GAG TTA GTC AGT CAA
4592 IQAQPDKSESELVSQ pHDMHgpm2.seq
4592 ATC CAG GCC CAG CCC GAC AAG TCC GAG TCC GAG CTG GTG TCC CAG












4130

4112 IIEQLIKKEKVYLAW NL4-3 genbank.SEQ
4112 ATA ATA GAG CAG TTA ATA AAA AAG GAA AAA GTC TAC CTG GCA TGG
4110 IIEQLIKKEKVYLAW pNL4-3.seq
4110 ATA ATA GAG CAG TTA ATA AAA AAG GAA AAA GTC TAC CTG GCA TGG
4637 IIEQLIKKEKVYLAW pHDMHgpm2.seq
4637 ATC ATC GAG CAG CTG ATC AAG AAG GAG AAG GTG TAC CTG GCC TGG




4160 4190

4157 VPAHKGIGGNEQVDG NL4-3 genbank.SEQ
4157 GTA CCA GCA CAC AAA GGA ATT GGA GGA AAT GAA CAA GTA GAT GGG
4155 VPAHKGIGGNEQVDK pNL4-3.seq
4155 GTA CCA GCA CAC AAA GGA ATT GGA GGA AAT GAA CAA GTA GAT AAG
4682 VPAHKGIGGNEQVDK pHDMHgpm2.seq
4682 GTG CCC GCC CAC AAG GGC ATC GGC GGC AAC GAG CAG GTG GAC AAG















4220

4202 LVSAGIRKVLFLDGI NL4-3 genbank.SEQ
4202 TTG GTC AGT GCT GGA ATC AGG AAA GTA CTA ... TTA GAT GGA ATA
4200 LVSAGIRKVLFLDGI pNL4-3.seq
4200 TTG GTC AGT GCT GGA ATC AGG AAA GTA CTA ... TTA GAT GGA ATA
4727 LVSAGIRKVLFLDGI pHDMHgpm2.seq
4727 CTG GTG TCC GCC GGC ATC CGC AAG GTG CTG TTC CTG GAC GGC ATC



Fig. 9H



5267 HLKTAVQMAVFIHNF pHDMHgpm2.seq
5267 CAC CTG AAG ACC GCC GTG CAG ATG GCC GTG TTC ATC CAC AAC TTC

4790 4820

4787 KRKGGIGGYSAGERI NL4-3 genbank.SEQ
4787 AAA AGA AAA GGG GGG ATT GGG GGG TAC AGT GCA GGG GAA AGA ATA
4785 KRKGGIGGYSAGERI pNL4-3.seq
4785 AAA AGA AAA GGG GGG ATT GGG GGG TAC AGT GCA GGG GAA AGA ATA
5312 KRKGGIGGYSAGERI pHDMHgpm2.seq
5312 AAG CGC AAG GGC GGC ATC GGC GGC TAC TCC GCC GGC GAG CGC ATC

4850

4832 VDIIATDIQTKELQK NL4-3 genbank.SEQ
4832 GTA GAC ATA ATA GCA ACA GAC ATA CAA ACT AAA GAA TTA CAA AAA
4830 VDIIATDIQTKELQK pNL4-3.seq
4830 GTA GAC ATA ATA GCA ACA GAC ATA CAA ACT AAA GAA TTA CAA AAA
5357 VDIIATDIQTKELQK pHDMHgpm2.seq
5357 GTG GAC ATC ATC GCC ACC GAC ATC CAG ACC AAG GAG CTG CAG AAG

4880 4910

4877 QITKIQNFRVYYRDS NL4-3 genbank.SEQ
4877 CAA ATT ACA AAA ATT CAA AAT TTT CGG GTT TAT TAC AGG GAC AGC
4875 QITKIQNFRVYYRDS pNL4-3.seq
4875 CAA ATT ACA AAA ATT CAA AAT TTT CGG GTT TAT TAC AGG GAC AGC
5402 QITKIQNFRVYYRDS pHDMHgpm2.seq
5402 CAG ATC ACC AAG ATC CAG AAC TTC CGC GTG TAC TAC CGC GAC TCC

4940

4922 RDPVWKGPAKLLWKG NL4-3 genbank.SEQ
4922 AGA GAT CCA GTT TGG AAA GGA CCA GCA AAG CTC CTC TGG AAA GGT
4920 RDPVWKGPAKLLWKG pNL4-3.seq
4920 AGA GAT CCA GTT TGG AAA GGA CCA GCA AAG CTC CTC TGG AAA GGT
5447 RDPVWKGPAKLLWKG pHDMHgpm2.seq
5447 CGC GAC CCC GTG TGG AAG GGC CCC GCC AAG CTG CTG TGG AAG GGC

4970 5000

4967 EGAVVIQDNSDIKVV NL4-3 genbank.SEQ
4967 GAA GGG GCA GTA GTA ATA CAA GAT AAT AGT GAC ATA AAA GTA GTG
4965 EGAVVIQDNSDIKVV pNL4-3.seq
4965 GAA GGG GCA GTA GTA ATA CAA GAT AAT AGT GAC ATA AAA GTA GTG
5492 EGAVVIQDNSDIKVV pHDMHgpm2.seq
5492 GAG GGC GCC GTG GTG ATC CAG GAC AAC TCC GAC ATC AAG GTG GTG

5030

5012 PRRKAKIIRDYGKQM NL4-3 genbank.SEQ
5012 CCA AGA AGA AAA GCA AAG ATC ATC AGG GAT TAT GGA AAA CAG ATG
5010 PRRKAKIIRDYGKQM pNL4-3.seq
5010 CCA AGA AGA AAA GCA AAG ATC ATC AGG GAT TAT GGA AAA CAG ATG
5537 PRRKAKIIRDYGKQM pHDMHgpm2.seq
5537 CCC CGC CGC AAG GCC AAG ATC ATC CGC GAC TAC GGC AAG CAG ATG

5060 5090

5057 AGDDCVASRQDED* NL4-3 genbank.SEQ
5057 GCA GGT GAT GAT TGT GTG GCA AGT AGA CAG GAT GAG GAT TAA
5055 AGDDCVASRQDED* pNL4-3.seq
5055 GCA GGT GAT GAT TGT GTG GCA AGT AGA CAG GAT GAG GAT TAA
5582 AGDDCVASRQDED* pHDMHgpm2.seq
5582 GCC GGC GAC GAC TGC GTG GCC TCC CGC CAG GAC GAG GAC TAA
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AGCTTGGCCC ATTGCATACG TTGTATCCAT ATCATAATAT GTACATTTAT ATTGGCTCAT 
GTCCAACATT ACCGCCATGT TGACATTGAT TATTGACTAG TTATTAATAG TAATCAATTA 
CGGGGTCATT AGTTCATAGC CCATATATGG AGTTCCGCGT TACATAACTT ACGGTAAATG 
GCCCGCCTGG CTGACCGCCC AACGACCCCC GCCCATTGAC GTCAATAATG ACGTATGTTC 
GCCAATAGGG ACTTTCCATT GACGTCAATG GGTGGAGTAT TTACGGTAAA 
GGCAGTACAT CAAGTGTATC ATATGCCAAG TACGCCCCCT ATTGACGTCA 
ATGGCCCGCC TGGCATTATG CCCAGTACAT GACCTTATGG GACTTTCCTA 
CATCTACGTA TTAGTCATCG CTATTACCAT GGTGATGCGG TTTTGGCAGT 
GCGTGGATAG CGGTTTGACT CACGGGGATT TCCAAGTCTC CACCCCATTG 
GAC^TTGTTT TGGCACCAAA ATCAACGGGA CTTTCCAAAA TGTCGTAACA 
ATTGACGCAA ATGGGCGGTA GGCGTGTACG GTGGGAGGTC TATATAAGCA 
AGTGAACCGT CAGATCGCCT GGAGACGCCA TCCACGCTGT TTTGACCTCC 
CCGGGACCGA TCCAGCCTCC CCTCGAAGCT GATCCTGAGA ACTTCAGGGT 
GACCCTTGAT GTTTTCTTTC CCCTTCTTTT CTATGGTTAA GTTCATGTCA 
AGAAGTAACA GGGTACACAT ATTGACCAAA TCAGGGTAAT TTTGCATTTG 
AAATGCTTTC TTCTTTTAAT ATACTTTTTT GTTTATCTTA TTTCTAATAC 
CTCTTTCTTT CAGGGCAATA ATGATACAAT GTATCATGCC TCTTTGCACC 
ATAACAGTGA TAATTTCTGG GTTAAGGCAA TAGCAATATT TCTGCATATA 
CATATAAATT GTAACTGATG TAAGAGGTTT CATATTGCTA ATAGCAGCTA 
ACCATTCTGC TTTTATTTTA TGGTTGGGAT 
CCCTTTTGCT AATCATGTTC ATACCTCTTA 
TGGTCTGTGT GCTGGCCCAT CACTTTGGCA 
CCTCCGTGCT GTCCGGCGGC GAGCTGGACA 
GCAAGAAGCA GTACAAGCTG AAGCACATCG 
CCGTGAACCC CGGCCTGCTG GAGACCTCCG 
AGCCCTCCCT GCAAACCGGC TCCGAGGAGC 
TGTACTGCGT GCACCAGCGC ATCGACGTGA 
AGGAGGAGCA GAACAAGTCC AAGAAGAAGG 
ACTCCCAGGT GTCCCAGAAC TACCCCATCG 
AGGCCATCTC CCCCCGCACC CTGAACGCCT 
CCCCCGAAGT CATCCCCATG TTCTCCGCCC 
ACACCATGCT GAACACCGTG GGCGGCCACC 
TCAACGAGGA GGCCGCCGAG TGGGACCGCC 
CCGGCCAGAT GCGCGAGCCC CGCGGCTCCG 
AGCAGATCGG CTGGATGACC CACAACCCCC 
GGATCATCCT GGGCCTGAAC AAGATCGTGC 
TCCGCCAGGG CCCCAAGGAG CCCTTCCGCG 
GCGCCGAGCA GGCCTCCCAG GAGGTAAAGA 
ACGCCAACCC CGACTGCAAG ACCATCCTGA 
ACCCTGGAGG AGATGATGAC CGCCTGCCAG GGCGTGGGCG 
GTGCTGGCCG AGGCCATGTC CCAAGTCACC AACCCCGCCA 
AACTTCCGCA ACCAGCGCAA GACCGTGAAG TGCTTCAACT 



CCATAGTAAC 
CTGCCCACTT 
ATGACGGTAA 
CTTGGCAGTA 
ACATCAATGG 
ACGTCAATGG 
ACTCCGCCCC 
GAGCTCGTTT 
ATAGAAGACA 
GAGTCTATGG 
TAGGAAGGGG 
TAATTTTAAA 
TTTCCCTAAI 
ATTCTAAAGA 
AATATTTCTG 
CAATCCAGCT 
CCAAGCTAGG 
GGCAACGTGC 
GGCGCCCGCG 
CGCCCCGGCG 
GAGCGCTTCG 
GGCCAGCTGC 
ATCGCCGTGC 
GACAAGATCG 
ACCGGCAACA 
ATGGTGCACC 
AAGGCCTTCT 
CAGGACCTGA 
AAGGAGACCA 
CCCATCGCCC 
ACCCTGCAAG 
TACAAGCGCT 
ATCCTGGACA 
AAGACCCTGC 
CTGGTGCAGA 



AAGGCTGGAT TATTCTGAGT 
TCTTCCTCCC ACAGCTCCTG 
AAGAATTCTA GACTGCCATG 
AGTGGGAGAA GATCCGCCTG 
TGTGGGCCTC CCGCGAGCTG 
AGGGCTGCCG CCAGATCCTG 
TGCGCTCCCT GTACAACACC 
AGGACACCAA GGAGGCCCTG 
CCCAGCAGGC CGCCGCCGAC 
TGCAGAACCT GCAGGGCCAG 
GGGTGAAGGT GGTGGAGGAG 
TGTCCGAGGG CGCCACCCCC 
AGGCCGCCAT GCAGATGCTG 
TGCACCCCGT GCACGCCGGC 
ACATCGCCGG CACCACCTCC 
CCATCCCCGT GGGCGAGATC 
GCATGTACTC CCCCACCTCC 
ACTACGTGGA CCGCTTCTAC 
ACTGGATGAC CGAGACCCTG 
AGGCCCTGGG CCCCGGCGCC 
GCCCCGGCCA CAAGGCCCGC 
CCATCATGAT CCAGAAGGGC 
GCGGCAAGGA GGGCCACATC 
AGTGCGGCAA GGAGGGCCAC 



rrCAAGAACT GCCGCGCCCC CCGCAAGAAG GGCTGCTGGA 

SSgSJg attgtactga gagacaggct aattttttag ggaagatctg gccttcccac 
SgggSSc Sgggaattt tcttcagagc agaccagagc caacagcccc accagaagag 
^c??Sgg^ Sggggaaga gacaacaact ccctctcaga agcaggagcc gatagacaag 
SSSatc ctttagcttc cctcagatca ctctttggca gcgacccctc gtcacaataa 
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300 
360- 
420 
480 
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660 
720 
780 
840 
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960 
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1080 
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1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
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1800 
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AGATCGGTGG CCAGCTGAAG GAGGCCCTGC TGGACACCGG CGCCGACGAC ACCGTGCTGG 2880 

AGGAGATGAA CCTGCCCGGC CGCTGGAAGC CCAAGATGAT CGGCGGCATC GGCGGCTTCA 2940 

TCAAAGTCCG CCAGTACGAC CAGATCCTGA TCGAGATCTG CGGCCACAAG GCCATCGGCA 3000 

CCGTGCTGGT GGGCCCCACC CCCGTGAACA TCATCGGCCG CAACCTGCTG ACCCAGATCG 3060 

GCTGCACCCT GAACTTCCCC ATCTCCCCCA TCGAGACCGT GCCCGTGAAG CTGAAGCCCG 3120 

GCATGGACGG CCCCAAAGTC AAGCAGTGGC CCCTGACCGA GGAGAAGATC AAGGCCCTGG 3180 

TCGAGATCTG CACCGAGATG GAGAAGGAGG GCAAGATCTC CAAGATCGGC CCCGAGAACC 3240

CCTACAACAC CCCCGTGTTC GCCATCAAGA AGAAGGACTC CACCAAGTGG CGCAAGCTGG 3300

TGGACTTCCG CGAGCTGAAC AAGCGCACCC AGGACTTCTG GGAGGTGCAG CTGGGCATCC 3360 

CCCACCCCGC CGGCCTGAAG CAGAAGAAGT CCGTGACCGT GCTGGACGTG GGCGACGCCT 3420 

ACTTCTCCGT GCCCCTGGAC AAGGACTTCC GCAAGTACAC CGCCTTCACC ATCCCCTCCA 3480 

TCAACAACGA GACCCCCGGC ATCCGCTACC AGTACAACGT GCTGCCCCAG GGCTGGAAGG 3540 

GCTCCCCCGC CATCTTCCAG TGCTCCATGA CCAAGATCCT GGAGCCCTTC CGCAAGCAGA 3600 

ACCCCGACAT CGTGATCTAC CAGTACATGG ACGACCTGTA CGTGGGCTCC GACCTGGAGA 3660

TCGGCCAGCA CCGCACCAAG ATCGAGGAGC TGCGCCAGCA CCTGCTGCGC TGGGGCTTCA 3720 

CCACCCCCGA CAAGAAGCAC CAGAAGGAGC CCCCCTTCCT GTGGATGGGC TACGAGCTGC 3780 

ACCCCGACAA GTGGACCGTG CAGCCCATCG TGCTGCCCGA GAAGGACTCC TGGACCGTGA 3840 

ACGACATCCA GAAGCTGGTG GGCAAGCTGA ACTGGGCCTC CCAGATCTAC GCCGGCATCA 3900 

AAGTCCGCCA GCTGTGCAAG CTGCTGCGCG GCACCAAGGC CCTGACCGAG GTGGTGCCCC 3960 

TGACCGAGGA GGCCGAGCTG GAGCTGGCCG AGAACCGCGA GATCCTGAAG GAGCCCGTGC 4020 

ACGGCGTGTA CTACGACCCC TCCAAGGACC TGATCGCCGA GATCCAGAAG CAGGGCCAGG 4080 

GCCAGTGGAC CTACCAGATC TACCAGGAGC CCTTCAAGAA CCTGAAGACC GGCAAATACG 4140 

CCCGCATGAA GGGCGCCCAC ACCAACGACG TGAAGCAGCT GACCGAGGCC GTGCAGAAGA 4200 

TCGCCACCGA GTCCATCGTG ATCTGGGGCA AGACTCCCAA GTTCAAGCTG CCCATCCAGA 4260 

AGGAGACCTG GGAGGCCTGG TGGACCGAGT ACTGGCAGGC CACCTGGATC CCCGAGTGGG 4320

AGTTCGTGAA CACCCCCCCC CTGGTGAAGC TGTGGTACCA GCTGGAGAAG GAGCCCATCA 4380 

TCGGCGCCGA GACCTTCTAC GTGGACGGCG CCGCCAACCG CGAGACCAAG CTGGGCAAGG 4440 

CCGGCTACGT GACCGACCGC GGCCGCCAGA AGGTGGTGCC CCTGACCGAG ACCACCAACC 4500 

AGAAGACCGA GCTGCAGGCC ATCCACCTGG CCCTGCAAGA CTCCGGCCTG GAGGTGAACA 4560 

TCGTGACCGA CTCCCAGTAT GCATTGGGCA TCATCCAGGC CCAGCCCGAC AAGTCCGAGT 4620 

CCGAGCTGGT GTCCCAGATC ATCGAGCAGC TGATCAAGAA GGAGAAGGTG TACCTGGCCT 4680 

GGGTGCCCGC CCACAAGGGC ATCGGCGGCA ACGAGCAGGT GGACAAGCTG GTGTCCGCCG 4740 

GCATCCGCAA GGTGCTGTTC CTGGACGGCA TCGACAAGGC CCAGGAGGAG CACGAGAAGT 4800 

ACCACTCCAA CTGGCGCGCC ATGGCCTCCG ACTTCAACCT GCCCCCCGTG GTGGCCAAGG 4860 

AGATCGTGGC CTCCTGCGAC AAGTGCCAGC TGAAGGGCGA GGCCATGCAC GGCCAGGTGG 4920 

ACTGCTCCCC CGGCATCTGG CAGCTGGACT GCACCCACCT GGAGGGCAAG GTGATCCTGG 4980 

TGGCCGTGCA CGTGGGCTCC GGCTACATCG AGGCCGAGGT GATCCCCGCC GAGACCGGCC 5040 

AGGAGACCGC CTACTTCCTG CTGAAGCTGG CCGGCCGCTG GCCCGTGAAG ACCGTGCACA 5100 

CCGACAACGG CTCCAACTTC ACCTCCACCA CCGTGAAGGC CGCCTGCTGG TGGGCCGGCA 5160 

TCAAGCAGGA GTTCGGCATC CCCTACAACC CCCAGTCCCA GGGCGTGATC GAGTCCATGA 5220 

ACAAGGAGCT GAAGAAGATC ATCGGCCAAG TCCGCGACCA GGCCGAGCAC CTGAAGACCG 5280 

CCGTGCAGAT GGCCGTGTTC ATCCACAACT TCAAGCGCAA GGGCGGCATC GGCGGCTACT 5340 

CCGCCGGCGA GCGCATCGTG GACATCATCG CCACCGACAT CGAGACCAAG GAGCTGCAGA 5400 

AGCAGATCAC CAAGATCCAG AACTTCCGCG TGTACTACCG CGACTCCCGC GACCCCGTGT 5460 

GGAAGGGCCC CGCCAAGCTG CTGTGGAAGG GCGAGGGCGC CGTGGTGATC CAGGACAACT 5520 

CCGACATCAA GGTGGTGCCC CGCCGCAAGG CCAAGATCAT CCGCGACTAC GGCAAGCAGA 5580 

TGGCCGGCGA CGACTGCGTG GCCTCCCGCC AGGACGAGGA CTAACACATG GAAAA6ATTA 5640 
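
The numbered rows of the listing above follow a regular layout: six blocks of ten bases followed by the running position of the last base in the row. That regularity can be used to spot scanning errors such as a split number or a stray character inside a block. The sketch below is a hedged illustration of such a check. The function names `parse_listing_row` and `rows_are_consistent` are hypothetical, and the layout rule is inferred from the rows themselves rather than stated in the source.

```python
import re


def parse_listing_row(row):
    """Split a listing row into (sequence, end_position).

    Raises ValueError when a block is not exactly ten A/C/G/T bases.
    """
    tokens = row.split()
    end_position = int(tokens[-1])
    blocks = tokens[:-1]
    for block in blocks:
        if not re.fullmatch(r"[ACGT]{10}", block):
            raise ValueError(f"unexpected block: {block!r}")
    return "".join(blocks), end_position


def rows_are_consistent(rows):
    """Check that each row advances the running position by its own length."""
    previous_end = None
    for row in rows:
        sequence, end_position = parse_listing_row(row)
        if previous_end is not None and end_position - previous_end != len(sequence):
            return False
        previous_end = end_position
    return True


# Three consecutive rows copied from the listing (positions 2880 to 3000).
ROWS = [
    "AGATCGGTGG CCAGCTGAAG GAGGCCCTGC TGGACACCGG CGCCGACGAC ACCGTGCTGG 2880",
    "AGGAGATGAA CCTGCCCGGC CGCTGGAAGC CCAAGATGAT CGGCGGCATC GGCGGCTTCA 2940",
    "TCAAAGTCCG CCAGTACGAC CAGATCCTGA TCGAGATCTG CGGCCACAAG GCCATCGGCA 3000",
]

if __name__ == "__main__":
    print(rows_are_consistent(ROWS))
```

A row whose position was split by the scanner (for example `33 00` instead of `3300`) fails the block check and raises `ValueError`, which makes such rows easy to locate.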
GTAAAACACC ATAGGCCGCT CTAGAGGATC CAAGCTTATC GATACCGTCG ACCTCGAGGG 5700 

CCCAGATCTA ATTCACCCCA CCAGTGCAGG CTGCCTATCA GAAAGTGGTG GCTGGTGTGG 5760 

CTAATGCCCT GGCCCACAAG TATCACTAAG CTCGCTTTCT TGCTGTCCAA TTTCTATTAA 5820 

AGGTTCCTTT GTTCCCTAAG TCCAACTACT AAACTGGGGG ATATTATGAA GGGCCTTGAG 5880 

CATCTGGATT CTGCCTAATA AAAAACATTT ATTTTCATTG CAATGATGTA TTTAAATTAT 5940 

TTCTGAATAT TTTACTAAAA AGGGAATGTG GGAGGTCAGT GCATTTAAAA CATAAAGAAA 6000 

TGAAGAGCTA GTTCAAACCT TGGGAAAATA CACTATATCT TAAACTCCAT GAAAGAAGGT 6060 

GAGGCTGCAA ACAGCTAATG CACATTGGCA ACAGCCCCTG ATGCCTATGC CTTATTCATC 6120 

CCTCAGAAAA GGATTCAAGT AGAGGCTTGA TTTGGAGGTT AAAGTTTTGC TATGCTGTAT 6180 

TTTACATTAC TTATTGTTTT AGCTGTCCTC ATGAATGTCT TTTCACTACC CATTTGCTTA 6240 

TCCTGCATCT CTCAGCCTTG ACTCCACTCA GTTCTCTTGC TTAGAGATAC CACCTTTCCC 6300 

CTGAAGTGTT CCTTCCATGT TTTACGGCGA GATGGTTTCT CCTCGCCTGG CCACTCAGCC 6360 

TTAGTTGTCT CTGTTGTCTT ATAGAGGTCT ACTTGAAGAA GGAAAAACAG GGGGCATGGT 6420 

TTGACTGTCC TGTGAGCCCT TCTTCCCTGC CTCCCCCACT CACAGTGACC CGGAATCCCT 6480 

CGACATGGCA GTCTAGATCA TTCTTGAAGA CGAAAGGGCC TCGTGATACG CCTATTTTTA 6540 

TAGGTTAATG TCATGATAAT AATGGTTTCT TAGACGTCAG GTGGCACTTT TCGGGGAAAT 6600 

GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT CAAATATGTA TCCGCTCATG 6660 

AGACAATAAC CCTGATAAAT GCTTCAATAA TATTGAAAAA GGAAGAGTAT GAGTATTCAA 6720 

CATTTCCGTG TCGCCCTTAT TCCCTTTTTT GCGGCATTTT GCCTTCCTGT TTTTGCTCAC 6780 

CCAGAAACGC TGGTGAAAGT AAAAGATGCT GAAGATCAGT TGGGTGCACG AGTGGGTTAC 6840 

ATCGAACTGG ATCTCAACAG CGGTAAGATC CTTGAGAGTT TTCGCCCCGA AGAACGTTTT 6900 

CCAATGATGA GCACTTTTAA AGTTCTGCTA TGTGGCGCGG TATTATCCCG TATTGACGCC 6960 

GGGCAAGAGC AAGTCGGTCG CCGCATACAC TATTCTCAGA ATGACTTGGT TGAGTACTCA 7020 

CCAGTCACAG AAAAGCATCT TACGGATGGC ATGACAGTAA GAGAATTATG CAGTGCTGCC 7080 

ATAACCATGA GTGATAACAC TGCGGCCAAC TTACTTCTGA CAACGATCGG AGGACCGAAG 7140 

GAGCTAACCG CTTTTTTGCA CAACATGGGG GATCATGTAA CTCGCCTTGA TCGTTGGGAA 7200 

CCGGAGCTGA ATGAAGCCAT ACCAAACGAC GAGCGTGACA CCACGATGCC TGTAGCAATG 7260 

GCAACAACGT TGCGCAAACT ATTAACTGGC GAACTACTTA CTCTAGCTTC CCGGCAACAA 7320 

TTAATAGACT GGATGGAGGC GGATAAAGTT GCAGGACCAC TTCTGCGCTC GGCCCTTCCG 7380 

GCTGGCTGGT TTATTGCTGA TAAATCTGGA GCCGGTGAGC GTGGGTCTCG CGGTATCATT 7440 

GCAGCACTGG GGCCAGATGG TAAGCCCTCC CGTATCGTAG TTATCTACAC GACGGGGAGT 7500 

CAGGCAACTA TGGATGAACG AAATAGACAG ATCGCTGAGA TAGGTGCCTC ACTGATTAAG 7560 

CATTGGTAAC TGTCAGACCA AGTTTACTCA TATATACTTT AGATTGATTT AAAACTTCAT 7620 

TTTTAATTTA AAAGGATCTA GGTGAAGATC CTTTTTGATA ATCTCATGAC CAAAATCCCT 7680 

TAACGTGAGT TTTCGTTCCA CTGAGCGTCA GACCCCGTAG AAAAGATCAA AGGATCTTCT 7740 

TGAGATCCTT TTTTTCTGCG CGTAATCTGC TGCTTGCAAA CAAAAAAACC ACCGCTACCA 7800 

GCGGTGGTTT GTTTGCCGGA TCAAGAGCTA CCAACTCTTT TTCCGAAGGT AACTGGCTTC 7860 

AGCAGAGCGC AGATACCAAA TACTGTTCTT CTAGTGTAGC CGTAGTTAGG CCACCACTTC 7920 

AAGAACTCTG TAGCACCGCC TACATACCTC GCTCTGCTAA TCCTGTTACC AGTGGCTGCT 7980 

GCCAGTGGCG ATAAGTCGTG TCTTACCGGG TTGGACTCAA GACGATAGTT ACCGGATAAG 8040 

GCGCAGCGGT CGGGCTGAAC GGGGGGTTCG TGCACACAGC CCAGCTTGGA GCGAACGACC 8100 

TACACCGAAC TGAGATACCT ACAGCGTGAG CTATGAGAAA GCGCCACGCT TCCCGAAGGG 8160 

AGAAAGGCGG ACAGGTATCC GGTAAGCGGC AGGGTCGGAA CAGGAGAGCG CACGAGGGAG 8220 

CTTCCAGGGG GAAACGCCTG GTATCTTTAT AGTCCTGTCG GGTTTCGCCA CCTCTGACTT 8280 

GAGCGTCGAT TTTTGTGATG CTCGTCAGGG GGGCGGAGCC TATGGAAAAA CGCCAGCAAC 8340 

GGATGCGCCG CGTGCGGCTG CTGGAGATGG CGGACGCGAT GGATATGTTC TGCCAAGGGT 8400 

TGGTTTGCGC ATTCACAGTT CTCCGCAAGA ATTGATTGGC TCCAATTCTT GGAGTGGTGA 8460 
ATCCGTTAGC GAGGTGCCGC CGGCTTCCAT TCAGGTCGAG GTGGCCCGGC TCCATGCACC 8520 

GCGACGCAAC GCGGGGAGGC AGACAAGGTA TAGGGCGGCG CCTACAATCC ATGCCAACCC 8580 

GTTCCATGTG CTCGCCGAGG CGGCATAAAT CCCCGTGACG ATCAGCGGTC CAATGATCGA 8640 

AGTTAGGCTG GTAAGAGCCG CGAGCGATCC TTGAAGCTGT CCCTGATGGT CGTCATCTAC 8700 

CTGCCTGGAC AGCATGGCCT GCAACGCGGG CATCCCGATG CCGCCGGAAG CGAGAAGAAT 8760 

CATAATGGGG AAGGCCATCC AGCCTCGCGT CGGGGAGCTT TTTGCAAAAG CCTAGGCCTC 8820 

CAAAAAAGCC TCCTCACTAC TTCTGGAATA GCTCAGAGGC CGAGGCGGCC TCGGCCTCTG 8880 
CATAAATAAA AAAAATTAGT CAGCCATG 8908 